I am currently developing a function in Postgres, and I use a traces table to gather information from it. The following piece of code has me stuck, and I would appreciate your help.
--For Month-Year, we subtract the last part so we could have only "Month"
V_SQL_QRY := 'UPDATE ZONE AS ZN '||
'SET '||V_CAT_DMOIS||' = regexp_replace(EC.OCCURRENCE, ''\-[0-9]{4}'',''''), '
||V_CAT_ANNEE||' = SUBSTRING(EC.OCCURRENCE FROM ''....$''), '
||V_CAT_DM||' = ROUND(EC.MONTANT_BASE::numeric, 2) '||
'FROM ECHEANCE AS EC '||
'WHERE '||V_ZONE_C||' AND ZN.UNID = EC.ZONE_ID AND EC.CANEVAS_ID = '||V_TAX_ID||' AND EC.ETAT = ''P'' '||
'AND SUBSTRING(EC.OCCURRENCE FROM ''....$'')||convertmonth(regexp_replace(EC.OCCURRENCE,''\-[0-9]{4}'','''')) '||
'= (SELECT MAX(SUBSTRING(EH.OCCURRENCE FROM ''....$'')||convertmonth(regexp_replace(EH.OCCURRENCE,''\-[0-9]{4}'','''')) '||
'FROM ECHEANCE AS EH '||
'WHERE EH.ZONE_ID = EC.ZONE_ID AND EH.CANEVAS_ID = EC.CANEVAS_ID AND EH.ETAT = ''P'') '||
'AND NOT ( '||V_CAT_DMOIS||' IS NOT NULL '||
'AND '||V_CAT_ANNEE||' IS NOT NULL AND ' ||V_CAT_ANNEE||' SIMILAR TO ''\d{4}'' '||
'AND '||V_CAT_DM||' IS NOT NULL AND '||V_CAT_DM||' SIMILAR TO ''[\d\s]+[.]?\d*'' '|| -- On considère le montant avec '.' au lieu de la ','
'AND CAST(FORMAT('||V_CAT_DM||') AS DOUBLE PRECISION) > 0 '||
'AND '||V_CAT_ANNEE||' > EC.ANNEE_REFERENCE '||
'OR '||V_CAT_ANNEE||' = EC.ANNEE_REFERENCE '||
'AND convertmonth('||V_CAT_DMOIS||') > convertmonth(regexp_replace(EH.OCCURRENCE,''-[0-9]{4}'',''''))))';
EXECUTE V_SQL_QRY;
When I executed it as a test, the traces table gave me the following error:
SQLSTATE=42601, SQLERRM=Syntax Error Near "FROM"
I'd be grateful for your help.
This code is very hard to read. A good aid in this case is to call RAISE NOTICE '%', V_SQL_QRY; before execution, so you can see the statement that is actually sent. When the dynamic string is complex, the format function and custom string delimiters (dollar quoting) help:
EXECUTE format($str$ UPDATE foo SET %I='AHOJ' $str$, column_name);
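As a rough illustration of that advice, here is a minimal sketch that builds a much-reduced version of the UPDATE with format and dollar quoting and prints it before running it. The column name dmois is a hypothetical stand-in for V_CAT_DMOIS, and the EXECUTE is left commented out so the block runs without the ZONE and ECHEANCE tables. Printing the original statement this way would likely make the problem visible: counting parentheses in the question's code, the MAX( opened in the subquery does not appear to be closed before FROM ECHEANCE AS EH, which would match the reported syntax error near "FROM" (and there is one extra closing parenthesis at the very end).

```sql
DO $$
DECLARE
    v_col text := 'dmois';  -- hypothetical column name, stands in for V_CAT_DMOIS
    v_sql text;
BEGIN
    -- Dollar quoting ($q$ ... $q$) removes the need to double every single quote;
    -- %I inserts the value as a safely quoted identifier.
    v_sql := format($q$
        UPDATE zone AS zn
        SET %I = regexp_replace(ec.occurrence, '-[0-9]{4}', '')
        FROM echeance AS ec
        WHERE zn.unid = ec.zone_id
          AND ec.etat = 'P'
    $q$, v_col);

    -- Inspect the generated statement before executing it.
    RAISE NOTICE '%', v_sql;

    -- EXECUTE v_sql;  -- enable once the printed statement looks right
END
$$;
```

Because the statement is written as one readable block of SQL, unbalanced parentheses are much easier to spot than in a chain of concatenated fragments.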
Here is the code in SAS. It finds the numeric columns that contain missing values and replaces those values with 0:
DATA dummy_table;
SET dummy_table;
ARRAY DUMMY _NUMERIC_;
DO OVER DUMMY;
IF DUMMY=. THEN DUMMY=0;
END;
RUN;
I am trying to replicate this in Redshift. Here is what I tried:
create or replace procedure sp_replace_null_to_zero(IN tbl_nm varchar) as $$
Begin
Execute 'declare ' ||
'tot_cnt int := (select count(*) from information_schema.columns where table_name = ' || tbl_nm || ');' ||
'init_loop int := 0; ' ||
'cn_nm varchar; '
Begin
While init_loop <= tot_cnt
Loop
Raise info 'init_loop = %', Init_loop;
Raise info 'tot_cnt = %', tot_cnt;
Execute 'Select column_name into cn_nm from information_schema.columns ' ||
'where table_name ='|| tbl_nm || ' and ordinal_position = init_loop ' ||
'and data_type not in (''character varying'',''date'',''text''); '
Raise info 'cn_nm = %', cn_nm;
if cn_nm is not null then
Execute 'Update ' || tbl_nm ||
'Set ' || cn_nm = 0 ||
'Where ' || cn_nm is null or cn_nm =' ';
end if;
init_loop = init_loop + 1;
end loop;
End;
End;
$$ language plpgsql;
Issues I am facing:
When I pass the input parameter here, I get a count of 0:
tot_cnt int := (select count(*) from information_schema.columns where table_name = ' || tbl_nm || ');'
For testing purposes I tried hardcoding the table name inside the procedure, and I get the error: amazon invalid operation: value for domain information_schema.cardinal_number violates check constraint "cardinal_number_domain_check"
Is this even possible in Redshift? How can I implement this logic, or is there another workaround?
Any expert advice would be appreciated.
You can simply run an UPDATE over the table(s) using the NVL(cn_nm,0) function:
UPDATE tbl_raw
SET col2 = NVL(col2,0);
However, UPDATE is a fairly expensive operation. Consider instead using a view over your table that wraps the columns in NVL(cn_nm,0):
CREATE VIEW tbl_clean
AS
SELECT col1
, NVL(col2,0) col2
FROM tbl_raw;
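If the column-by-column loop from the question is still wanted, a minimal sketch could iterate over the catalog rows directly instead of counting columns and probing ordinal_position. This is an assumption-laden sketch, not a tested Redshift solution: it assumes information_schema.columns can be queried from a stored procedure in your environment (Redshift's SVV_COLUMNS system view could be substituted if it cannot), and the list of numeric type names is illustrative rather than exhaustive.

```sql
CREATE OR REPLACE PROCEDURE sp_replace_null_to_zero(tbl_nm varchar) AS $$
DECLARE
    col RECORD;  -- one catalog row per numeric column
BEGIN
    FOR col IN
        SELECT column_name
        FROM information_schema.columns
        WHERE table_name = tbl_nm
          -- assumed list of numeric types; extend as needed
          AND data_type IN ('smallint', 'integer', 'bigint',
                            'numeric', 'real', 'double precision')
    LOOP
        RAISE INFO 'updating column %', col.column_name;
        -- Only identifiers are concatenated, each through quote_ident.
        EXECUTE 'UPDATE ' || quote_ident(tbl_nm)
             || ' SET '   || quote_ident(col.column_name) || ' = 0'
             || ' WHERE ' || quote_ident(col.column_name) || ' IS NULL';
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

It would be called as CALL sp_replace_null_to_zero('tbl_raw');. Note that this issues one UPDATE per column, so the cost concern raised above applies even more strongly here.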
In my project we sometimes have to copy all the data from one schema into another. I automated this with a simple truncate / insert into select * script, but soon realized that this approach does not tolerate changes in the source schema (adding or deleting tables required modifying the script). So today I decided to change it to a PL/pgSQL script that creates the tables and copies the data using dynamic queries. My first implementation was something like this:
do
$$
declare
source_schema text := 'source_schema';
dest_schema text := 'dest_schema';
obj_name text;
source_table text;
dest_table text;
alter_columns text;
begin
for dest_table in
select table_schema || '.' || table_name
from information_schema.tables
where table_schema = dest_schema
order by table_name
loop
execute 'drop table ' || dest_table;
end loop;
raise notice 'Data cleared';
for obj_name in
select table_name
from information_schema.tables
where table_schema = source_schema
order by table_name
loop
source_table := source_schema || '.' || obj_name;
dest_table := dest_schema || '.' || obj_name;
execute 'create unlogged table ' || dest_table
|| ' (like ' || source_table || ' including comments)';
alter_columns := (
select string_agg('alter column ' || column_name || ' drop not null', ', ')
from information_schema.columns
where table_schema = dest_schema and table_name = obj_name
and is_nullable = 'NO');
if alter_columns is not null then
execute 'alter table ' || dest_table || ' ' || alter_columns;
end if;
execute 'insert into ' || dest_table || ' select * from ' || source_table;
raise notice '% done', obj_name;
end loop;
end;
$$
language plpgsql;
As the destination schema is read-only, I create it without constraints to get maximum performance. I don't think NOT NULL constraints are a big deal, but I decided to leave everything here as it was.
This solution worked perfectly, but I noticed that it took longer to copy the data than the static script did. Not dramatically, but consistently 20-30 seconds longer.
I decided to investigate. My first step was to comment out the insert into select * statement to find out how long everything else takes. It showed that clearing and recreating all tables takes only half a second. My guess was that INSERT statements somehow run slower in a procedural context.
Then I added measuring of the execution time:
ts := clock_timestamp();
execute 'insert into ...';
raise notice 'obj_name: %', clock_timestamp() - ts;
I also ran the old static script with \timing in psql. But this showed that my assumption was wrong. All insert statements took more or less the same time, and were mostly even faster in the dynamic script (I suppose because of autocommit and network round trips after each statement in psql). However, the overall time of the dynamic script was again longer than that of the static script.
Mysticism?
Then I added very verbose logging with timestamps like this:
raise notice '%: %', clock_timestamp()::timestamp(3), 'label';
I discovered that create table sometimes executes immediately, but sometimes takes several seconds to finish. OK, but how come all these statements for all tables took just milliseconds to complete in my first experiment?
Then I split the single loop into two: the first one creates all the tables (and we now know that takes just milliseconds) and the second one only inserts the data:
do
$$
declare
source_schema text := 'onto_oper';
dest_schema text := 'onto';
obj_name text;
source_table text;
dest_table text;
alter_columns text;
begin
raise notice 'Clearing data...';
for dest_table in
select table_schema || '.' || table_name
from information_schema.tables
where table_schema = dest_schema
order by table_name
loop
execute 'drop table ' || dest_table;
end loop;
raise notice 'Data cleared';
for obj_name in
select table_name
from information_schema.tables
where table_schema = source_schema
order by table_name
loop
source_table := source_schema || '.' || obj_name;
dest_table := dest_schema || '.' || obj_name;
execute 'create unlogged table ' || dest_table
|| ' (like ' || source_table || ' including comments)';
alter_columns := (
select string_agg('alter column ' || column_name || ' drop not null', ', ')
from information_schema.columns
where table_schema = dest_schema and table_name = obj_name
and is_nullable = 'NO');
if alter_columns is not null then
execute 'alter table ' || dest_table || ' ' || alter_columns;
end if;
end loop;
raise notice 'All tables created';
for obj_name in
select table_name
from information_schema.tables
where table_schema = source_schema
order by table_name
loop
source_table := source_schema || '.' || obj_name;
dest_table := dest_schema || '.' || obj_name;
execute 'insert into ' || dest_table || ' select * from ' || source_table;
raise notice '% done', obj_name;
end loop;
end;
$$
language plpgsql;
Surprisingly, it fixed everything! This version runs faster than the old static script!
We arrive at a very weird conclusion: create table after inserts may sometimes take a long time. This is very frustrating. Although I solved my problem, I don't understand why it happens. Does anybody have any idea?
EDIT
It seems my issue occurs when this select statement returns null (which is the case I'm trying to handle: when it returns null, I want my new value to be -999). How can I go about doing this if it errors out whenever a null is found?
ORIGINAL
I have read every other SO post I could find regarding this error, but none of them seemed to address the root of my issue.
The error is pretty straightforward - one of my arguments within my EXECUTE statement is null. Great. However, I print out each of the values that make up my EXECUTE statement right before it gets called, and I can clearly see that none of the values are null.
Code:
CREATE FUNCTION inform_icrm_prob_flow_query(tablename text, location_id int,
product_date_str text, lead_time_start int,
lead_time_end int, first_member_id int,
last_member_id int, dest_file text)
RETURNS void AS $$
DECLARE
count int;
product_date TIMESTAMPTZ;
interval_lead_time_start text;
interval_lead_time_end text;
curr_value double precision;
query text;
BEGIN
product_date := product_date_str::TIMESTAMPTZ;
count := first_member_id;
curr_value := 0;
interval_lead_time_start := ''''|| product_date ||'''::timestamptz +
interval '''||lead_time_start||' hours''';
interval_lead_time_end := ''''|| product_date ||'''::timestamptz +
interval '''||lead_time_end||' hours'' -
interval ''6 hours''';
--create our temporary table and populate it's date column
EXECUTE 'CREATE TEMPORARY TABLE temp_table_icrm_prob_flow AS
SELECT * FROM generate_series('||interval_lead_time_start || ',' ||
interval_lead_time_end || ', ''6 hours'')
AS date_valid';
LOOP
EXIT WHEN count > last_member_id;
IF NOT EXISTS(
SELECT 'date_valid'
FROM information_schema.columns
WHERE table_name='temp_table_icrm_prob_flow'
and column_name='value'||count||'')
THEN
EXECUTE 'ALTER TABLE temp_table_icrm_prob_flow ADD COLUMN value' || count
|| ' double precision DEFAULT -999';
END IF;
raise notice 'tablename: %', tablename;
raise notice 'location_id: %', location_id;
raise notice 'product_date: %', product_date;
raise notice 'count: %', count;
query := 'SELECT value FROM '|| tablename ||'
INNER JOIN temp_table_icrm_prob_flow
ON (temp_table_icrm_prob_flow.date_valid = '|| tablename ||'.date_valid)
WHERE '|| tablename ||'.id_location = '|| location_id ||'
AND '|| tablename ||'.date_product = '''|| product_date ||'''
AND '|| tablename ||'.id_member = '|| count ||'';
EXECUTE query INTO curr_value;
EXECUTE 'UPDATE temp_table_icrm_prob_flow
SET value'|| count ||' = COALESCE('|| curr_value ||', -999)';
count := count + 1;
END LOOP;
EXECUTE 'ALTER TABLE temp_table_icrm_prob_flow DROP COLUMN date_valid';
EXECUTE 'COPY temp_table_icrm_prob_flow TO '''||dest_file||''' DELIMITER '','' CSV';
EXECUTE 'DROP TABLE temp_table_icrm_prob_flow';
END;
$$ LANGUAGE plpgsql;
Output:
NOTICE: tablename: inform_tseries_data_basin_proc_fcst_prob_flow
NOTICE: location_id: 38
NOTICE: product_date: 2015-02-05 12:00:00+00
NOTICE: count: 1
ERROR: query string argument of EXECUTE is null
CONTEXT: PL/pgSQL function inform_icrm_prob_flow_query(text,integer,text,integer,integer,integer,integer,text) line 38 at EXECUTE
If none of the variables I am passing in are null, and the only other thing referenced is a temp table that I know exists, what could be causing this error?
Note: when changing my query to:
query := 'SELECT value FROM '|| tablename ||' WHERE '|| tablename ||'.id_location = '|| location_id ||' AND '|| tablename ||'.date_product = '''|| product_date ||''' AND '|| tablename ||'.id_member = '|| count ||' AND temp_table_icrm_prob_flow.date_valid = '|| tablename ||'.date_valid';
I get the following error:
NOTICE: tablename: inform_tseries_data_basin_proc_fcst_prob_flow
NOTICE: location_id: 38
NOTICE: product_date: 2015-02-05 12:00:00+00
NOTICE: count: 1
ERROR: missing FROM-clause entry for table "temp_table_icrm_prob_flow"
LINE 1: ..._data_basin_proc_fcst_prob_flow.id_member = 1 AND temp_table...
^
QUERY: SELECT value FROM inform_tseries_data_basin_proc_fcst_prob_flow WHERE inform_tseries_data_basin_proc_fcst_prob_flow.id_location = 38 AND inform_tseries_data_basin_proc_fcst_prob_flow.date_product = '2015-02-05 12:00:00+00' AND inform_tseries_data_basin_proc_fcst_prob_flow.id_member = 1 AND temp_table_icrm_prob_flow.date_valid = inform_tseries_data_basin_proc_fcst_prob_flow.date_valid
CONTEXT: PL/pgSQL function inform_icrm_prob_flow_query(text,integer,text,integer,integer,integer,integer,text) line 35 at EXECUTE
Sorry for going slightly off topic. Your code is pretty unreadable (and vulnerable to SQL injection). There are some techniques that you can use:
Use the USING clause of the EXECUTE statement for ordinary parameters.
DO $$
DECLARE
tablename text := 'mytab';
from_date date := CURRENT_DATE;
BEGIN
EXECUTE 'INSERT INTO ' || quote_ident(tablename) || ' VALUES($1)'
USING from_date;
END
$$;
This code is safe (thanks to the quote_ident function), a little faster (the from_date variable is passed as a binary value, which avoids repeated string<->date conversions), and a little more readable (the string expression is shorter).
Use the format function. Building the query string becomes shorter and more readable (table aliases help too):
query := format('
SELECT value
FROM %I _dtn
INNER JOIN temp_table_icrm_prob_flow t ON t.date_valid = _dtn.date_valid
WHERE _dtn.id_location = $1
AND _dtn.date_product = $2
AND _dtn.id_member = $3'
, tablename);
EXECUTE query INTO curr_value USING location_id, product_date, count;
Naming variables after SQL keywords or identifiers is a bad idea; names such as count and values are poor choices.
The error message is clear: you are using the identifier temp_table_icrm_prob_flow.date_valid, but the table temp_table_icrm_prob_flow is not mentioned in the query's FROM clause. The query is missing the JOIN part.
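On the "query string argument of EXECUTE is null" error itself: in PostgreSQL, concatenating NULL with || yields NULL, so if the SELECT finds no row and curr_value stays NULL, the whole UPDATE string built from it becomes NULL before EXECUTE ever sees it. A minimal sketch of one way around this is to pass the value through USING instead of concatenating it. The table demo_flow and the variable member_id are hypothetical stand-ins for the temp table and the loop counter.

```sql
DO $$
DECLARE
    curr_value double precision;  -- stays NULL when the SELECT finds no row
    member_id  int := 1;          -- hypothetical stand-in for the loop counter
BEGIN
    CREATE TEMPORARY TABLE demo_flow (value1 double precision) ON COMMIT DROP;
    INSERT INTO demo_flow VALUES (0);

    -- The statement text never contains curr_value, so it cannot become NULL;
    -- the value travels as parameter $1 and COALESCE handles the NULL case.
    EXECUTE format('UPDATE demo_flow SET %I = COALESCE($1, -999)',
                   'value' || member_id)
    USING curr_value;

    RAISE NOTICE 'value1 = %', (SELECT value1 FROM demo_flow);
END
$$;
```

Only the column name is spliced into the string (through %I); the data value is bound as a parameter, which also removes the injection risk for that value.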
When I run the following command from a function I defined, I get the error "EXECUTE of SELECT ... INTO is not implemented". Does this mean the specific command is not allowed (i.e. "SELECT ...INTO")? Or does it just mean I'm doing something wrong? The actual code causing the error is below. I apologize if the answer is already out here, however I looked and could not find this specific error. Thanks in advance... For whatever it's worth I'm running 8.4.7
vCommand = 'select ' || stmt.column_name || ' as id ' ||
', count(*) as nCount
INTO tmpResults
from ' || stmt.table_name || '
WHERE ' || stmt.column_name || ' IN (select distinct primary_id from anyTable
WHERE primary_id = ' || stmt.column_name || ')
group by ' || stmt.column_name || ';';
EXECUTE vCommand;
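INTO is ambiguous in this use case (it could mean SQL's SELECT ... INTO a new table or PL/pgSQL's INTO a variable), so it is prohibited inside the string passed to EXECUTE.
You can use CREATE TABLE AS SELECT instead.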
CREATE OR REPLACE FUNCTION public.f1(tablename character varying)
RETURNS integer
LANGUAGE plpgsql
AS $function$
begin
execute 'create temp table xx on commit drop as select * from '
|| quote_ident(tablename);
return (select count(*) from xx);
end;
$function$;
postgres=# select f1('omega');
f1
────
2
(1 row)
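When the goal is to capture a value in a variable rather than create a table, the other reading of INTO is available by putting it on the EXECUTE statement itself, outside the query string. The following is a minimal sketch; pg_class is used only because it exists in every database, and the DO block needs PostgreSQL 9.0 or later, so on 8.4 the same body would have to be wrapped in a function.

```sql
DO $$
DECLARE
    n bigint;
BEGIN
    -- INTO belongs to EXECUTE here, not to the SELECT inside the string.
    EXECUTE 'SELECT count(*) FROM ' || quote_ident('pg_class')
    INTO n;

    RAISE NOTICE 'row count: %', n;
END
$$;
```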
I need a stored procedure that loops over a table which returns the name of a table in DB2 and, depending on that name, runs a select statement against the named table. I have tried doing it with EXECUTE IMMEDIATE in so many ways that I have lost count. Here is an example of the EXECUTE IMMEDIATE attempt:
set insertstring = 'INSERT INTO pribpm.TEMP_T_TOQUE_CICLO (idSemana,tiempo_ciclo,tiempo_toque)
SELECT to_number(to_char( '''|| ' time_stamp ' ||''' ,' || ' IW ' || ')) ,SUM(KPITOTALTIMECLOCK),SUM(s.KPIEXECUTIONTIMECLOCK) FROM ' || TABLA || ' where to_number(to_char( '''|| ' time_stamp ' ||''' ,' || ' IW ' || ')) between ' || (to_number(to_char(FECHA,'IW'))-3) || ' and ' || to_number(to_char(FECHA,'IW')) || ' GROUP BY to_number(to_char('''|| ' time_stamp ' ||''' ,' || ' IW ' || '))';
PREPARE stmt FROM insertstring;
EXECUTE IMMEDIATE insertstring;
where tabla is a string that contains the name of the table and fecha is a date of timestamp type.
Besides that, I have tried it with cursors, like this:
set select_ = 'SELECT time_stamp, KPITOTALTIMECLOCK, KPIEXECUTIONTIMECLOCK FROM ' || tabla;
PREPARE stmt FROM select_;
FOR v2 AS
c2 cursor for
execute select_
do
if to_number(to_char(time_stamp,'IW')) between
(to_number(to_char(fecha,'IW'))-3) and to_number(to_char(fecha,'IW')) then
--something here
end if;
END FOR;
but with no success.
Could someone please help me find my error, or suggest another way to do what I am trying to do?
All of this is in a DB2 environment.
Write a procedure that loops over SYSCAT.TABLES to get each table name, and inside that loop run a select query against each table.
I am not 100% sure, as it has been a long time since I worked with DB2.