How to loop over JSON data and count the values in Postgres

CREATE OR REPLACE FUNCTION file_compare()
RETURNS text LANGUAGE 'plpgsql'
COST 100 VOLATILE AS $BODY$
DECLARE
    filedata text[];
    fpo_data jsonb;
    inddata jsonb;
    f_cardholderid text;
    f_call_receipt text;
    f_primary_key text;  -- was used below but never declared
    i INT;
BEGIN
    -- Collect up to 100 fo_data values into one JSONB array
    SELECT json_agg((fpdata))::jsonb
    FROM (SELECT fo_data AS fpdata
          FROM fpo
          LIMIT 100
    ) t INTO fpo_data;
    i := 0;
    FOR inddata IN SELECT * FROM jsonb_array_elements(fpo_data) LOOP
        f_cardholderid := (inddata->>0)::JSONB->'cardholder_id'->>'value';
        f_call_receipt := (inddata->>0)::JSONB->'call_receipt_date'->>'value';
        -- The original concatenated an undefined f_auth_clm_number here;
        -- f_call_receipt is assumed, based on the sample output below
        f_primary_key := f_cardholderid || f_call_receipt;
        filedata[i] := jsonb_build_object(
            'fc_primary_key', f_primary_key
        );
        i := i + 1;
    END LOOP;
    RAISE NOTICE 'PRINTING DATA %', filedata;
    -- The function is declared RETURNS text, so it has to return a value
    RETURN filedata::text;
END;
$BODY$;
I am getting filedata as shown below:
NOTICE: PRINTING DATA ={"{\"fc_primary_key\": \"A1234567892017/06/27\"}","{\"fc_primary_key\": \"A1234567892017/06/27\"}","{\"fc_primary_key\": \"A1234567892017/08/07\"}","{\"fc_primary_key\": \"A1234567892017/08/07\"}","{\"fc_primary_key\": \"A1234567892017/08/07\"}","{\"fc_primary_key\": \"A1234567892017/08/07\"}","{\"fc_primary_key\": \"A1234567892017/08/07\"}","{\"fc_primary_key\": \"A1234567892024/03/01\"}","{\"fc_primary_key\": \"A12345678945353\"}","{\"fc_primary_key\": \"A1234567892023/11/22\"}","{\"fc_primary_key\": \"A12345678945252\"}","{\"fc_primary_key\": \"A1234567892017-07-01\"}"}
Now I want to iterate over filedata, take each fc_primary_key value, and count how many times it appears in the rest of the JSON data.
Note: each fc_primary_key must be compared only with the values that come after it. It should not be compared with the fc_primary_key values before it.
For example, the third element is "A1234567892017/08/07", and it appears 4 more times after that position, so the count must be 4.
The same "A1234567892017/08/07" is also the seventh element, but there are no more occurrences of it after the seventh position, so the count there must be 0.
How do I loop over the data and get these counts? I am new to Postgres and have not been able to find a solution. Please help!

I was able to get the result you describe with the code below. By unnesting the data, you are able to take advantage of regular SQL syntax (offset, grouping, counting) which are the crux of the problem you described.
DO
$body$
DECLARE
fildata TEXT[] = ARRAY ['{''fc_primary_key'': ''A1234567892017/06/27''}','{''fc_primary_key'': ''A1234567892017/06/27''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892024/03/01''}','{''fc_primary_key'': ''A12345678945353''}','{''fc_primary_key'': ''A1234567892023/11/22''}','{''fc_primary_key'': ''A12345678945252''}','{''fc_primary_key'': ''A1234567892017-07-01''}'];
count INTEGER;
BEGIN
FOR i IN 1 .. array_length(fildata, 1) LOOP
SELECT count(*) - 1
INTO count
FROM (
SELECT unnest(fildata) AS x OFFSET (i - 1)
) AS t
WHERE x = fildata[i]
GROUP BY x;
RAISE NOTICE 'Row % appears % times after the current', fildata[i], count;
END LOOP;
END
$body$ LANGUAGE plpgsql;
Alternatively, you can get the entire set of data in a single statement (if that would be helpful) by using windowing instead of offset.
SELECT t
, count(*) OVER (PARTITION BY t ORDER BY rn RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) - 1 AS count
FROM (
SELECT row_number() OVER () AS rn, t
FROM unnest(
ARRAY ['{''fc_primary_key'': ''A1234567892017/06/27''}','{''fc_primary_key'': ''A1234567892017/06/27''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892017/08/07''}','{''fc_primary_key'': ''A1234567892024/03/01''}','{''fc_primary_key'': ''A12345678945353''}','{''fc_primary_key'': ''A1234567892023/11/22''}','{''fc_primary_key'': ''A12345678945252''}','{''fc_primary_key'': ''A1234567892017-07-01''}']) AS t
) AS x
ORDER BY rn;
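A variant of the window query, offered as a sketch under assumptions: it reads the keys straight from a JSONB array and uses WITH ORDINALITY (PostgreSQL 9.4 and later) to get an explicit element position, rather than relying on row_number() OVER () to follow array order. The inline sample array and the names e, elem, ord and count_after are illustrative and not part of the original code.

```sql
-- Count, for each array element, how many later elements share its key.
-- The sample array is made up; in the function above it would be a JSONB
-- array holding the fc_primary_key objects.
SELECT e.elem ->> 'fc_primary_key' AS fc_primary_key,
       count(*) OVER (
           PARTITION BY e.elem ->> 'fc_primary_key'
           ORDER BY e.ord
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
       ) - 1 AS count_after   -- minus 1 to exclude the current element
FROM jsonb_array_elements(
         '[{"fc_primary_key": "A1"},
           {"fc_primary_key": "A1"},
           {"fc_primary_key": "B2"},
           {"fc_primary_key": "A1"}]'::jsonb
     ) WITH ORDINALITY AS e(elem, ord)
ORDER BY e.ord;
-- count_after, in array order: 2, 1, 0, 0
```

Because ord comes from the array position itself, the "after" relationship does not depend on the order in which rows happen to be returned.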

Related

stored procedure - pass multiple values for IN clause

I'm writing a stored procedure for Postgres that just does select * from table where id in (value1, value2, ...).
The values will come from a variable.
My code:
CREATE OR REPLACE PROCEDURE record_example(v_name varchar(100), v_id int)
LANGUAGE plpgsql
AS $$
DECLARE
rec RECORD;
BEGIN
FOR rec IN select id, updated from mytable where names in (v_name) and id=v_id
LOOP
RAISE INFO 'id = % and updated = %', rec.id, rec.updated;
END LOOP;
END;
$$;
This actually works if I use a single value for v_name.
Ex:
call record_example('myname',101);
But if I pass multiple values, it does not work.
HELP 1 :
call record_example('myname, your_name',101);
It just returned CALL, that's it. Nothing happened.
HELP 2:
Sometimes the v_id variable is optional, and in that case the FOR loop should not include the id=v_id condition.
Ex:
FOR rec IN select id, updated from mytable where names in (v_name)
LOOP
RAISE INFO 'id = % and updated = %', rec.id, rec.updated;
END LOOP;
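One possible approach, sketched under assumptions rather than taken from an answer: pass the names as an array and compare with = ANY, because names in (v_name) compares against the single string 'myname, your_name' and matches nothing. The procedure name record_example_multi, the parameter v_names and the NULL default for v_id are hypothetical choices for illustration. mytable and its columns come from the question.

```sql
-- Hypothetical variant of record_example that accepts several names and an
-- optional id (requires PostgreSQL 11+ for CREATE PROCEDURE).
CREATE OR REPLACE PROCEDURE record_example_multi(v_names text[], v_id int DEFAULT NULL)
LANGUAGE plpgsql
AS $$
DECLARE
    rec RECORD;
BEGIN
    FOR rec IN
        SELECT id, updated
        FROM mytable
        WHERE names = ANY (v_names)            -- array form of IN (...)
          AND (v_id IS NULL OR id = v_id)      -- skip the id filter when v_id is omitted
    LOOP
        RAISE INFO 'id = % and updated = %', rec.id, rec.updated;
    END LOOP;
END;
$$;

-- With and without the optional id:
CALL record_example_multi(ARRAY['myname', 'your_name'], 101);
CALL record_example_multi(ARRAY['myname', 'your_name']);
```

If the names must arrive as one comma-separated string, string_to_array(v_name, ',') can build the array, but note that 'myname, your_name' would then yield ' your_name' with a leading space unless the parts are trimmed.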

ERROR: query has no destination for result data: Postgresql

I am using PostgreSQL 11, and a function that works well in a single run fails when I add a LOOP statement, with
"ERROR: query has no destination for result data HINT: If you want to
discard the results of a SELECT, use PERFORM instead."
The function has VOID as return value, selects data from a source table into a temp table, calculates some data and inserts the result into a target table. The temp table is then dropped and the function ends. I would like to repeat this procedure in defined intervals and have included a LOOP statement. With LOOP it does not insert into the target table and does not actually loop at all.
create function transfer_cs_regular_loop(trading_pair character varying) returns void
language plpgsql
as
$$
DECLARE
first_open decimal;
first_price decimal;
last_close decimal;
last_price decimal;
highest_price decimal;
lowest_price decimal;
trade_volume decimal;
n_trades int;
start_time bigint;
last_entry bigint;
counter int := 0;
time_frame int := 10;
BEGIN
WHILE counter < 100 LOOP
SELECT max(token_trades.trade_time) INTO last_entry FROM token_trades WHERE token_trades.trade_symbol = trading_pair;
RAISE NOTICE 'Latest Entry: %', last_entry;
start_time = last_entry - (60 * 1000);
RAISE NOTICE 'Start Time: %', start_time;
CREATE TEMP TABLE temp_table AS
SELECT * FROM token_trades where trade_symbol = trading_pair and trade_time > start_time;
SELECT temp_table.trade_time,temp_table.trade_price INTO first_open, first_price FROM temp_table ORDER BY temp_table.trade_time ASC FETCH FIRST 1 ROW ONLY;
SELECT temp_table.trade_time,temp_table.trade_price INTO last_close, last_price FROM temp_table ORDER BY temp_table.trade_time DESC FETCH FIRST 1 ROW ONLY;
SELECT max(temp_table.trade_price) INTO highest_price FROM temp_table;
SELECT min(temp_table.trade_price) INTO lowest_price FROM temp_table;
SELECT INTO trade_volume sum(temp_table.trade_quantity) FROM temp_table;
SELECT INTO n_trades count(*) FROM temp_table;
INSERT INTO candlestick_data_5min_test(open, high, low, close, open_time, close_time, volume, number_trades, trading_pair) VALUES (first_price, highest_price, lowest_price, last_price, first_open, last_close, trade_volume, n_trades, trading_pair);
DROP TABLE temp_table;
counter := counter + 1;
SELECT pg_sleep(time_frame);
RAISE NOTICE '**************************Counter: %', counter;
END LOOP;
END;
$$;
The error refers to the last SELECT statement in the function, SELECT pg_sleep(time_frame). In PL/pgSQL, a SELECT without INTO still produces a result, and that result has no destination, which is exactly what the error message says. (In a LANGUAGE sql function, by contrast, the result of the last SELECT becomes the function's return value.)
Inside a PL/pgSQL block, use PERFORM when you want to run a query and discard its result. It does exactly the same thing as a SELECT but throws the rows away.
Therefore change the last SELECT into a PERFORM and the error will go away:
PERFORM pg_sleep(time_frame);
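A minimal, self-contained sketch of the same pattern, in case it helps to try it in isolation. The loop bound and the short sleep interval are arbitrary values chosen for the example, not taken from the question.

```sql
-- PERFORM inside a loop: the function is called for its side effect
-- (sleeping) and its result is discarded.
DO $$
DECLARE
    counter int := 0;
BEGIN
    WHILE counter < 3 LOOP
        counter := counter + 1;
        PERFORM pg_sleep(0.1);   -- writing SELECT pg_sleep(0.1); here raises the error
        RAISE NOTICE 'Counter: %', counter;
    END LOOP;
END;
$$;
```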

Update table every 1000 rows

I am trying to do an update on a specific record every 1000 rows using Postgres. I am looking for a better way to do that. My function is described below:
CREATE OR REPLACE FUNCTION update_row()
RETURNS void AS
$BODY$
declare
myUID integer;
nRow integer;
maxUid integer;
BEGIN
nRow:=1000;
select max(uid_atm_inp) from tab into maxUid where field1 = '1240200';
loop
if (nRow > 1000 and nRow < maxUid) then
select uid from tab into myUID where field1 = '1240200' and uid >= nRow limit 1;
update tab
set field = 'xxx'
where field1 = '1240200' and uid = myUID;
nRow:=nRow+1000;
end if;
end loop;
END; $BODY$
LANGUAGE plpgsql VOLATILE
How can I improve this procedure? I think there is something wrong. The loop does not end and takes too much time.
To perform this task in SQL, you could use the row_number window function and update only those rows where the number is divisible by 1000.
Your loop doesn't finish because there is no EXIT or RETURN in it.
I doubt you could ever rival the performance of a standard SQL update with a procedural loop. Instead of doing it a row at a time, just do it all as a single statement:
with t2 as (
select
uid, row_number() over (order by 1) as rn
from tab
where field1 = '1240200'
)
update tab t1
set field = 'xxx'
from t2
where
t1.uid = t2.uid and
mod (t2.rn, 1000) = 0
Per my comment, I am presupposing what you mean by "every 1000th row," since nothing in the question designates how to determine which tuple gets which row number. That is easily changed by editing the "order by" criteria.
Adding a second where clause on the update (t1.field1 = '1240200') can't hurt, but might not be necessary if the join is executed as a nested loop.
This might be notionally similar to what Laurenz has in mind.
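As a sketch of the "order by" change mentioned above, here is the same statement with an explicit ordering. It assumes that uid is unique and that "every 1000th row" means every 1000th row in uid order. Neither is confirmed by the question. The alias names are illustrative.

```sql
-- Number the matching rows in uid order, then update every 1000th one.
WITH numbered AS (
    SELECT uid,
           row_number() OVER (ORDER BY uid) AS rn   -- deterministic numbering
    FROM tab
    WHERE field1 = '1240200'
)
UPDATE tab AS t
SET field = 'xxx'
FROM numbered AS n
WHERE t.uid = n.uid
  AND n.rn % 1000 = 0;
```

With ORDER BY 1 the sort key is a constant, so the numbering follows whatever order the rows happen to be read in. Ordering by a real column makes the choice of rows repeatable.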
I solved it this way:
declare
myUID integer;
nRow integer;
rowNum integer;
checkrow integer;
myString varchar(272);
cur_check_row cursor for select uid , row_number() over (order by 1) as rn, substr(fieldxx,1,244)
from table where field1 = '1240200' and uid >= 1000 ORDER BY uid;
BEGIN
open cur_check_row;
loop
fetch cur_check_row into myUID, rowNum, myString;
EXIT WHEN NOT FOUND;
select mod(rowNum, 1000) into checkrow;
if checkrow = 0 then
update table
set fieldxx= myString||'O'
where uid in (myUID);
end if;
end loop;
close cur_check_row;

Using a row as a table in a query within a function PLpgSQL

I am trying to write a plpgsql function that loops through a table. On each loop, it pulls a row from the table, stores it in a record, then uses that record in the join clause of a query. Here is my code:
CREATE OR REPLACE FUNCTION "testfncjh2" () RETURNS int
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 record;
tablename text;
rec2 record;
BEGIN
counter = 0;
for rec1 in SELECT * FROM poilocations_sridconv loop
raise notice 'here';
execute $$ select count(*) from $$||rec1||$$ $$ into tablesize;
while counter < tablesize loop
counter = counter + 1;
raise notice 'hi';
execute $$ select count(*) from cities_sridconv $$ into tablesize;
end loop;
end loop;
return counter;
END;
$dbvis$ LANGUAGE plpgsql;
Each time I run this, I get the following error:
ERROR: could not find array type for data type record
Is there a way to use the row as a table in the query within the nested loops?
My end goal is to build a function that loops through a table, pulling a row from that table on each loop. In each loop, a number COUNTER is computed using the row, then a query is executed depending on the row and COUNTER. Knowing that this code is currently very flawed, I am posting it below to give an idea of what I am trying to do:
CREATE OR REPLACE FUNCTION "testfncjh" () RETURNS void
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 record;
tablename text;
rec2 record;
BEGIN
for rec1 in SELECT * FROM poilocations_sridconv loop
counter = 0;
execute $$ select count(*)
from $$||rec1||$$ a
join
cities_srid_conv b
on right(a.geom_wgs_pois,$$||counter||$$) = right(b.geom_wgs_pois,$$||counter||$$) $$ into tablesize;
raise notice 'got through first execute';
while tablesize = 0 loop
counter = counter + 1;
execute $$ select count(*)
from '||rec1||' a
join
cities_srid_conv b
on right(a.geom_wgs_pois,'||counter||') = right(b.geom_wgs_pois,'||counter||') $$ into tablesize;
raise notice 'hi';
end loop;
EXECUTE
'select
poiname,
name as cityname,
postgis.ST_Distance(postgis.ST_GeomFromText(''POINT(poilat poilong)''),
postgis.ST_GeomFromText(''POINT(citylat citylong)'')
) as distance
from (select a.poiname,
a.latitude::text as poilat,
a.longitude::text as poilong,
b.geonameid,
b.name,
b.latitude as citylat,
b.longitude as citylong
from '||rec1||' a
join cities_srid_conv b
on right(a.geom_wgs_pois,'||counter||') = right(b.geom_wgs_pois,'||counter||'))
) x
order by distance
limit 1'
poi_cities_match (poiname, cityname, distance); ------SQL STATEMENT TO INSERT CLOSEST CITY TO TABLE POI_CITIES_MATCH
end loop;
END;
$dbvis$ LANGUAGE plpgsql;
I am running on a PostgreSQL 8.2.15 database.
Also, sorry for reposting. I had to remove some data from the original.
I think you should be able to use composite types for what you want. I simplified your top example and used composite types in the following way.
CREATE OR REPLACE FUNCTION "testfncjh2" () RETURNS int
IMMUTABLE
SECURITY DEFINER
AS $dbvis$
DECLARE
counter int;
tablesize int;
rec1 poilocations_sridconv;
tablename text;
rec2 record;
BEGIN
counter = 0;
for rec1 in SELECT * FROM poilocations_sridconv loop
raise notice 'here';
select count(*) FROM (select (rec1).*)theRecord into counter;
end loop;
return counter;
END;
$dbvis$ LANGUAGE plpgsql;
The main changes are the rec1 poilocations_sridconv; declaration and the use of (select (rec1).*).
Hope it helps.
EDIT: I should note that the function is not doing the same thing as it does in the question above. This is just as an example of how you could use a record as a table in a query.
You have a few issues with your code (apart, perhaps, from your logic).
Foremost, you should not use a record as a table source in a JOIN. Instead, filter the second table for rows that match some field from the record.
Second, you should use the format() function instead of assembling strings with the || operator. But you can't because you are using the before-prehistoric version 8.2. This is from the cave-painting era (yes, it's that bad). UPGRADE!
Thirdly, don't over-complicate your queries. The sub-query is not necessary here.
Put together, the second dynamic query from your real code would reduce to this:
EXECUTE format(
'SELECT b.name,
postgis.ST_Distance(postgis.ST_SetSRID(postgis.ST_MakePoint(%1$I.longitude, %1$I.latitude), 4326),
postgis.ST_SetSRID(postgis.ST_MakePoint(b.longitude, b.latitude), 4326)) AS distance
FROM cities_srid_conv b
WHERE right(%1$I.geom_wgs_pois, %2$L) = right(b.geom_wgs_pois, %2$L)
ORDER BY distance
LIMIT 1', rec1, counter) INTO cityname, distance;
poi_cities_match (rec1.poiname, cityname, distance); ------SQL STATEMENT TO INSERT CLOSEST CITY TO TABLE POI_CITIES_MATCH
Here %1$I refers to the first parameter after the string, which is an identifier: rec1; %2$L is the second parameter, being a literal value: counter. I leave it to you to re-work this into pre-8.4 string concatenation. The results from the query are stored in a few additional variables which you can then use in the following function call.
Lastly, you had longitude and latitude reversed. In PostGIS longitude always comes first.
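To illustrate the first point above (filter the second table by a field of the record instead of joining against the record), here is a rough sketch. It assumes a modern PostgreSQL (9.1 or later, where right() exists), that geom_wgs_pois is a text column in both tables, and that the cities table is named cities_srid_conv (the question uses two spellings). The function name count_city_matches and the parameter suffix_len are hypothetical.

```sql
-- For each POI row, count the cities whose geom_wgs_pois shares the same
-- trailing suffix_len characters, and return the total over all POIs.
CREATE OR REPLACE FUNCTION count_city_matches(suffix_len int) RETURNS bigint
LANGUAGE plpgsql STABLE
AS $$
DECLARE
    rec1    poilocations_sridconv%ROWTYPE;
    matches bigint;
    total   bigint := 0;
BEGIN
    FOR rec1 IN SELECT * FROM poilocations_sridconv LOOP
        -- The record's field is used as a plain value in the WHERE clause;
        -- no dynamic SQL is needed because the table names are fixed.
        SELECT count(*)
        INTO matches
        FROM cities_srid_conv b
        WHERE right(b.geom_wgs_pois, suffix_len) = right(rec1.geom_wgs_pois, suffix_len);

        total := total + matches;
    END LOOP;
    RETURN total;
END;
$$;
```

The function is marked STABLE rather than IMMUTABLE because it reads from tables.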

Postgres - Dynamically referencing columns from record variable

I'm having trouble referencing record variable type columns dynamically. I found loads of tricks online, but with regards to triggers mostly and I really hope the answer isn't "it can't be done"... I've got a very specific and simple need, see my example code below;
First I have an array containing a list of column names called "lCols". I loop through a record variable to traverse my data, replacing values in a paragraph which exactly match my column names.
DECLARE lTotalRec RECORD;
DECLARE lSQL text;
DECLARE lCols varchar[];
p_paragraph:= 'I am [names] and my surname is [surname]';
lSQL :=
'select
p.names,
p.surname
from
person p
';
FOR lTotalRec IN
execute lSQL
LOOP
-- Loop through the already created array of columns to replace the values in the paragraph
FOREACH lVal IN ARRAY lCols
LOOP
p_paragraph := replace(p_paragraph,'[' || lVal || ']',lTotalRec.lVal); -- This is where my problem is, because lVal is not a column of lTotalRec directly like this
END LOOP;
RETURN NEXT;
END LOOP;
My return value is the paragraph amended for each record in "lTotalRec"
You could convert your record to a json value using the row_to_json() function. Once in this format, you can extract columns by name, using the -> and ->> operators.
In Postgres 9.4 and up, you can also make use of the more efficient jsonb type.
DECLARE lJsonRec jsonb;
...
FOR lTotalRec IN
execute lSQL
LOOP
lJsonRec := row_to_json(lTotalRec)::jsonb;
FOREACH lVal IN ARRAY lCols
LOOP
p_paragraph := replace(p_paragraph, '[' || lVal || ']', lJsonRec->>lVal);
END LOOP;
RETURN NEXT;
END LOOP;
See the documentation for more details.
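A self-contained sketch of the same idea that can be run as-is. It assumes PostgreSQL 9.5 or later for to_jsonb(). The inline VALUES rows stand in for the person table, and the variable names are illustrative.

```sql
-- Replace [column] placeholders in a template with values from each record,
-- looking the columns up by name through a jsonb copy of the record.
DO $$
DECLARE
    rec       record;
    rec_json  jsonb;
    col       text;
    cols      text[] := ARRAY['names', 'surname'];
    paragraph text;
BEGIN
    FOR rec IN
        SELECT * FROM (VALUES ('Ada', 'Lovelace'), ('Alan', 'Turing')) AS p(names, surname)
    LOOP
        rec_json  := to_jsonb(rec);
        -- Start from the template again for every record
        paragraph := 'I am [names] and my surname is [surname]';
        FOREACH col IN ARRAY cols LOOP
            -- coalesce keeps a NULL column from turning the whole paragraph NULL
            paragraph := replace(paragraph, '[' || col || ']', coalesce(rec_json ->> col, ''));
        END LOOP;
        RAISE NOTICE '%', paragraph;
    END LOOP;
END;
$$;
```

This raises two notices, "I am Ada and my surname is Lovelace" and "I am Alan and my surname is Turing". Resetting the template inside the outer loop matters: if the amended paragraph is carried over, the placeholders are already gone by the second record.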
You can convert a row to JSON using row_to_json(), and then retrieve the column names using json_object_keys() (or jsonb_object_keys() when the value is jsonb, as in the example below).
Here's an example:
drop table if exists TestTable;
create table TestTable (col1 text, col2 text);
insert into TestTable values ('a1', 'b1'), ('a2', 'b2');
do $$declare
sql text;
rec jsonb;
col text;
val text;
begin
sql := 'select row_to_json(row) from (select * from TestTable) row';
for rec in execute sql loop
for col in select * from jsonb_object_keys(rec) loop
val := rec->>col;
raise notice 'col=% val=%', col, val;
end loop;
end loop;
end$$;
This prints:
NOTICE: col=col1 val=a1
NOTICE: col=col2 val=b1
NOTICE: col=col1 val=a2
NOTICE: col=col2 val=b2
DO