pgloader cannot import while using TARGET COLUMNS - postgresql

I am having a hard time getting pgloader to work while trying to use the TARGET COLUMNS optional arguments.
LOAD CSV
FROM INLINE
HAVING FIELDS
(
npi,
...
)
INTO postgresql://user:pass!n@pg2/nadb?tablename=tempload
(
npi
)
WITH skip header = 1,
fields optionally enclosed by '"',
fields escaped by double-quote,
fields terminated by ','
SET work_mem to '64MB'
BEFORE LOAD EXECUTE
tempload.sql;
If I don't use the target columns, it works just fine. tempload has exactly the same columns as data.csv.
Every time I run it, it hangs at this point:
2016-06-09T17:17:33.749000-05:00 DEBUG
select i.relname,
n.nspname,
indrelid::regclass,
indrelid,
indisprimary,
indisunique,
pg_get_indexdef(indexrelid),
c.conname,
pg_get_constraintdef(c.oid)
from pg_index x
join pg_class i ON i.oid = x.indexrelid
join pg_namespace n ON n.oid = i.relnamespace
left join pg_constraint c ON c.conindid = i.oid
where indrelid = 'tempload'::regclass
I'm at a total loss. Like I said, it works fine if I don't use TARGET COLUMNS, so I really don't believe it is the data.
I get the same thing with release 3.2 and the docker image.

It turns out the issue has to do with the amount of memory. I changed to SET work_mem = '512' and it started to get past that point. I guess this has to do with the fact that I have 330 columns to import.

How to read / list security labels on columns in postgreSQL

I've set up PostgreSQL Anonymizer on my database with security labels and everything works fine.
I'm trying to regularly check whether any columns in my database are missing security labels, so that I can tell the developers to add them in the next release, but I can't find a way to read the security labels.
Does anyone know how to do this?
EDIT on 10/11/2022
Thanks to @Shiva, I ended up with this query:
select cl."oid", col.ordinal_position, col.table_schema, col.table_name, col.column_name
FROM information_schema.columns col
join pg_catalog.pg_class cl on cl.relname = col.table_name
WHERE col.table_schema = 'XXXX'
and not exists (select objoid FROM pg_seclabel where provider = 'anon' and objsubid = col.ordinal_position and objoid = cl."oid");
You have to query the pg_seclabel catalog to get the list of security labels.
SELECT objsubid, provider, label FROM pg_seclabel WHERE objoid::regclass = 'mytable'::regclass
objsubid is the column number; the corresponding column name can be found by querying the information_schema.columns view.
SELECT column_name FROM information_schema.columns WHERE table_name = 'mytable' AND ordinal_position = <column_number>
You can combine the above two queries to find columns that do not have the required security labels.
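One way the combination could look is sketched below. It reads column metadata from pg_attribute instead of information_schema.columns so that the table is matched by OID and schema rather than by name alone. The schema name 'my_schema' and the provider name 'anon' are placeholders taken as assumptions; adjust them to your setup.

```sql
-- Sketch: list columns of ordinary tables in one schema that have
-- no security label from the 'anon' provider.
-- 'my_schema' is a hypothetical schema name.
SELECT n.nspname AS table_schema,
       c.relname AS table_name,
       a.attname AS column_name
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_class c     ON c.oid = a.attrelid
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'my_schema'
  AND c.relkind = 'r'          -- ordinary tables only
  AND a.attnum > 0             -- skip system columns
  AND NOT a.attisdropped       -- skip dropped columns
  AND NOT EXISTS (
        SELECT 1
        FROM pg_catalog.pg_seclabel sl
        WHERE sl.classoid = 'pg_catalog.pg_class'::regclass
          AND sl.objoid   = c.oid
          AND sl.objsubid = a.attnum
          AND sl.provider = 'anon'
      )
ORDER BY c.relname, a.attnum;
```

Joining through pg_namespace keeps same-named tables in other schemas from being matched by accident.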

oracle merge query in postgres

I have this MERGE query in Oracle and it was working fine. Now we are migrating to Postgres 10 and trying to find an equivalent for it.
MERGE INTO s.act_pack C USING((SELECT A.jid, A.pid, B.pcode,
B.mc, A.md, A.hd FROM s.act_pack A INNER JOIN s.act_pack B
ON A.pid = B.pid AND A.pcode = B.mc AND (A.hd <> B.hd
OR A.md<> B.md)) order by A.upd_ts desc) D ON(C.pid = D.pid AND
C.pcode = D.pcode AND C.jid = D.jid) WHEN MATCHED THEN UPDATE SET C.md =
D.md, C.hd= D.hd;
Some forums on the web say that Postgres doesn't support MERGE and that INSERT ... ON CONFLICT should be used instead,
but with no background in Postgres, I am not able to understand how this complex query can be written using that.
Others say that Postgres 9.5 and above support the MERGE statement. Since we are using Postgres 10, I tried to run the same Oracle query in Postgres but received ERROR: syntax error at or near "MERGE"
Any help is highly appreciated.
You don't need an "UPSERT" because you are not doing an INSERT; a regular UPDATE is enough:
update s.act_pack C
SET md = D.md,
hd = D.hd
from (
SELECT A.jid, A.pid, B.pcode, B.mc, A.md, A.hd
FROM s.act_pack A
INNER JOIN s.act_pack B
ON A.pid = B.pid
AND A.pcode = B.mc
AND (A.hd <> B.hd OR A.md <> B.md)
) D
where C.pid = D.pid
AND C.pcode = D.pcode
AND C.jid = D.jid;
This is a direct "translation" of your code. Note that PostgreSQL does not allow the target columns in the SET clause to be qualified with the table alias, and that the ORDER BY from the original sub-query is dropped because it has no effect on an UPDATE. It is a bit strange that the same table is used three times, but without more information it's hard to know where exactly this could be made more efficient.

The multi-part identifier could not be bound - SQL Server 2016

SELECT clm.CLCL_PAYEE_PR_ID, clm.SBSB_CK, clm.CLCL_ID, clm.clcl_id_adj_to,clm.clcl_id_adj_from, clm.CLCL_PAID_DT
FROM ODW.DW.fac_cmc_clcl_claim CLM
INNER JOIN ODW.DW.fac_cmc_meme_member MEME ON MEME.meme_ck = CLM.meme_ck
INNER JOIN ODW.DW.fac_cmc_mepe_prcs_elig MEPE ON MEPE.meme_ck = MEME.meme_ck
INNER JOIN ODW.DW.fac_cmc_mepr_prim_prov MEPR ON MEPE.meme_ck = MEPR.meme_ck AND CLM.clcl_prpr_id_pcp = MEPR.prpr_id
INNER JOIN ODW.DW.fac_cmc_sbsb_subsc SBSB ON MEME.sbsb_ck = SBSB.sbsb_ck
INNER JOIN ODW.DW.fac_cmc_prpr_prov PROV ON MEPR.prpr_id = PROV.prpr_id AND PROV.prpr_mctr_prty = 'RISK'
INNER JOIN ODW.DW.fac_cmc_prer_relation PRER ON PRER.prpr_id = MEPR.prpr_id
INNER JOIN ODW.DW.fac_cmc_plds_plan_desc PLDS ON MEPE.cspi_id = PLDS.cspi_id
INNER JOIN ODW.DW.fac_cmc_pdds_prod_desc PDDS ON MEPE.pdpd_id = PDDS.pdpd_id
WHERE CLM.clcl_paid_dt BETWEEN '2019-12-24 00:00:00.000' AND '2019-12-30 23:59:59.997'
AND CLM.clcl_cur_sts = '02'
AND CLM.clcl_cl_type = 'M'
AND CLM.clcl_cl_sub_type = 'H'
AND CLM.grgr_ck IN (46)
AND MEPR.grgr_ck IN (46)
AND MEPE.grgr_ck IN (46)
AND MEPE.mepe_elig_ind = 'Y'
AND CLM.clcl_low_svc_dt BETWEEN MEPE.mepe_eff_dt AND MEPE.mepe_term_dt
AND CLM.clcl_low_svc_dt BETWEEN MEPR.mepr_eff_dt AND MEPR.mepr_term_dt
AND SBSB.grgr_ck IN (46)
AND PRER.prer_prpr_entity = 'I'
AND PRER.prer_prpr_id IN ('64456546')
AND (PLDS.plds_desc LIKE '%risk%' OR PDDS.pdds_desc LIKE '%risk%');
This query runs in PROD with variables substituted for the hard-coded values shown here. It runs around 100 times per day, and on some days some of the runs fail with this error:
The multi-part identifier "PDDS.pdds_desc" could not be bound
Please note that all the joins are being done on views.
When I re-run the failed process, it succeeds the second time with no changes to the underlying query.
Can anyone suggest what the issue could be? Any performance optimization suggestions for this query would also be appreciated.
Thanks!

Delete using left outer join in Postgres

I am switching a database from MySQL to PostgreSQL. A SELECT query that worked in MySQL also works in Postgres, but a similar DELETE query does not.
I have two tables that list where certain backup files are located: existing data (ed) and new data (nd). The following query picks out rows in the existing-data table for which the new-data table, matched on file name and path, has no information about where the file is located:
SELECT ed.id, ed.file_name, ed.cd_name, ed.path, nd.cd_name
FROM tv_episodes AS ed
LEFT OUTER JOIN data AS nd ON
ed.file_name = nd.file_name AND
ed.path = nd.path
WHERE ed.cd_name = 'MediaLibraryDrive' AND nd.cd_name IS NULL;
I wish to run a delete query using this syntax:
DELETE ed
FROM tv_episodes AS ed
LEFT OUTER JOIN data AS nd ON
ed.file_name = nd.file_name AND
ed.path = nd.path
WHERE ed.cd_name = 'MediaLibraryDrive' AND nd.cd_name IS NULL;
I have tried DELETE ed and DELETE ed.*, both of which produce: syntax error at or near "ed". I get similar errors if I try without the ed alias. If I attempt
DELETE FROM tv_episodes AS ed
LEFT JOIN data AS nd.....
Postgres sends back syntax error at or near "LEFT".
I'm stumped and can't find much on delete queries using joins that is specific to PostgreSQL.
As others have noted, you can't LEFT JOIN directly in a DELETE statement. You can, however, self-join the target table on its primary key with a USING clause, then left join against that self-joined table.
DELETE FROM tv_episodes
USING tv_episodes AS ed
LEFT OUTER JOIN data AS nd ON
ed.file_name = nd.file_name AND
ed.path = nd.path
WHERE
tv_episodes.id = ed.id AND
ed.cd_name = 'MediaLibraryDrive' AND nd.cd_name IS NULL;
Note the self-join on tv_episodes.id in the WHERE clause. This avoids the sub-query route suggested in other answers.
As bf2020 points out, Postgres does not support JOINs directly in a DELETE query. The suggestion of a sub-query led me to this solution: refine the SELECT query from above and use it as a sub-query in a DELETE statement:
DELETE FROM tv_episodes
WHERE id in (
SELECT ed.id
FROM tv_episodes AS ed
LEFT OUTER JOIN data AS nd ON
ed.file_name = nd.file_name AND
ed.path = nd.path
WHERE ed.cd_name = 'MediaLibraryDrive' AND nd.cd_name IS NULL
);
Sub-queries can be inefficient on some database systems, especially MySQL, consuming time and CPU resources. In my experience it is worth avoiding them where possible, both because of that inefficiency and because reaching for a sub-query is sometimes an easy way out of honing one's skills, such as learning JOIN syntax.
Since PostgreSQL does not permit a JOIN directly in a DELETE query, the above is the solution that works.
Use the DELETE... USING syntax:
DELETE FROM tv_episodes USING data WHERE
tv_episodes.file_name = data.file_name AND
tv_episodes.path = data.path AND
tv_episodes.cd_name = 'MediaLibraryDrive' AND
data.cd_name IS NULL;
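One caveat about the statement above: USING behaves like an inner join, so it only removes rows that do have a matching row in data whose cd_name is NULL. If the goal is to remove rows that have no match in data at all (which is what the LEFT OUTER JOIN ... IS NULL test in the question selects), an anti-join with NOT EXISTS is one option. This is a minimal sketch, assuming data.cd_name is never NULL for rows that exist, so that the IS NULL test in the question only ever means "no matching row".

```sql
-- Sketch: delete tv_episodes rows on 'MediaLibraryDrive' that have
-- no row in data with the same file_name and path.
-- Assumes data.cd_name is never NULL for existing rows.
DELETE FROM tv_episodes AS ed
WHERE ed.cd_name = 'MediaLibraryDrive'
  AND NOT EXISTS (
        SELECT 1
        FROM data AS nd
        WHERE nd.file_name = ed.file_name
          AND nd.path = ed.path
      );
```

Running the same WHERE clause in a SELECT first is a cheap way to check which rows would be removed.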
Instead of
DELETE ed
FROM tv_episodes AS ed
LEFT OUTER JOIN data AS nd ON
ed.file_name = nd.file_name AND
ed.path = nd.path
WHERE ed.cd_name = 'MediaLibraryDrive' AND nd.cd_name IS NULL;
please try
DELETE FROM tv_episodes
WHERE cd_name = 'MediaLibraryDrive' AND
(tv_episodes.file_name, tv_episodes.path) IN
(SELECT ed.file_name,
ed.path
FROM tv_episodes AS ed
LEFT OUTER JOIN data AS nd
ON ed.file_name = nd.file_name
AND ed.path = nd.path
WHERE nd.cd_name IS NULL)
;
According to the PostgreSQL documentation, a JOIN is not valid directly after the target table of a DELETE query. PostgreSQL supports the row-value form (a, b) IN (sub-query), so the two columns do not need to be concatenated.

T-SQL Delete command basing on table variable

I need to delete some rows from a table where the indexes (IDs) match the values stored in a table variable:
declare @m_table as table
(
number NUMERIC(18,0)
)
...
inserting some rows into @m_table
...
DELETE ct FROM [dbo].[customer_task] ct
inner join project_customer pc on pc.id_customer = @m_table.number
inner join customer_user cu on cu.id_project_customer = pc.id
WHERE ct.id_csr_user = cu.id AND ct.id_status = 1;
but this code generates an error: Must declare the scalar variable "@m_table". How can I solve that?
You probably have a 'GO' (a batch separator) in those '...'
Variable declarations do not span batches.
The error means that SQL Server expects @m_table to be referenced like a standard table (for example in a FROM clause), rather than like a scalar (int, bit, etc.) variable. Perhaps something like this will work?
DELETE ct FROM [dbo].[customer_task] ct
WHERE ct.id_csr_user IN (
SELECT cu.id FROM customer_user cu
INNER JOIN project_customer pc ON pc.id = cu.id_project_customer
WHERE pc.id_customer IN (SELECT number FROM @m_table)
) AND ct.id_status = 1;
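If you would rather keep the join form from the question, the table variable can be joined like any other table as long as it is given an alias. The following is a sketch only: the alias m and the inserted values 1 and 2 are illustrative assumptions, and the whole script has to run in a single batch (no GO between the DECLARE and the DELETE).

```sql
-- Sketch: join the table variable under an alias instead of
-- referencing it as @m_table.number.
DECLARE @m_table TABLE (number NUMERIC(18,0));

-- Hypothetical sample values.
INSERT INTO @m_table (number) VALUES (1), (2);

DELETE ct
FROM [dbo].[customer_task] AS ct
INNER JOIN customer_user AS cu    ON cu.id = ct.id_csr_user
INNER JOIN project_customer AS pc ON pc.id = cu.id_project_customer
INNER JOIN @m_table AS m          ON m.number = pc.id_customer
WHERE ct.id_status = 1;
```

The alias is what makes m.number a valid column reference; without it, SQL Server parses @m_table in the ON clause as a scalar variable, which is what the error message reports.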