PostgreSQL: Multiple Row Result Using WHILE LOOP

I have data as seen below.
The data show the fuel tank level at each timestamp: ftakhir is the fuel tank level at ga.ts and ftawal is the fuel tank level at gb.ts. As you can see, there is a gap between gb.ts and ga.ts. I want to know the fuel tank level for each timestamp recorded between ga.ts and gb.ts.
I use a WHILE loop:
do $$
declare
    t timestamp := '2020-11-30 13:53:10.596645';
    s timestamp := '2020-11-30 14:04:10.797056';
    id varchar := '05b92dc749ed3a09b35273cac3f3d68aabfcf737';
begin
    while t <= s loop
        PERFORM g.time FROM gpsapi g
        WHERE g.idalat = id AND g.time >= t AND g.time <= t + '1 minute 10 seconds'::INTERVAL;
        t := t + '1 minute 10 seconds'::INTERVAL;
    end loop;
end$$;
Using the WHILE loop, I want to get multiple rows of timestamps between ga.ts and gb.ts, and then calculate the traveled distance between the known timestamps.
How can I turn this into a function that returns multiple rows? I am a bit confused about how to combine SELECT with a WHILE loop in PostgreSQL.
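A set-returning function can do this: PERFORM discards its result set, but RETURN QUERY inside the loop appends rows to the function's output. Below is a minimal sketch under the question's assumptions (table gpsapi with columns idalat and time); the function name timestamps_between and the parameter names are made up:
CREATE OR REPLACE FUNCTION timestamps_between(
    id_in varchar,
    t_start timestamp,
    t_end timestamp
) RETURNS SETOF timestamp AS $$
DECLARE
    t timestamp := t_start;
BEGIN
    WHILE t <= t_end LOOP
        -- RETURN QUERY appends these rows to the function's result set
        RETURN QUERY
        SELECT g.time FROM gpsapi g
        WHERE g.idalat = id_in
          AND g.time >= t
          AND g.time < t + interval '1 minute 10 seconds';  -- strict upper bound so boundary rows aren't returned twice
        t := t + interval '1 minute 10 seconds';
    END LOOP;
END
$$ LANGUAGE plpgsql;
Call it like a table:
SELECT * FROM timestamps_between('05b92dc749ed3a09b35273cac3f3d68aabfcf737',
                                 '2020-11-30 13:53:10.596645',
                                 '2020-11-30 14:04:10.797056');
For this particular task a plain query over the whole interval would also work; the loop only pays off if each step needs separate processing, such as the distance calculation.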

Related

PostgreSQL - Comparing value based on condition (<, >, =) which written on a column?

I have this sample database:
Table 1:
Type  Condition_weight  Price
A     >50               1000
A     >10 & <50         500
A     <10               100
As I remember, it is possible to compare against the Condition_weight column without doing too much in the query.
The query I expect looks something like this:
select Price from Table_1
where Type = 'A'
and {my_input_this_is a number} satisfies Condition_weight
I read about this solution somewhere but can't find it again.
You can create a function that returns true or false; you will have to write the logic that extracts the min and max bounds from the condition text and compares the value against them.
pseudo code...
CREATE FUNCTION conditionWeightIsSatisfied(weight number)
BEGIN
set #minValue = 0;
set #MaxValue = 1000;
..... do your conversion of the text values.....
select weight >= #minValue and weight <= #maxValue
END
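A runnable PL/pgSQL version of that idea, as a sketch: it parses condition strings of the forms '>50', '<10', and '>10 & <50' (the only forms in the sample data). The function name and the lower-case table/column names in the usage example are assumptions:
CREATE OR REPLACE FUNCTION condition_weight_is_satisfied(cond text, weight numeric)
RETURNS boolean AS $$
DECLARE
    part text;
BEGIN
    -- Split on '&' and require every part to hold
    FOREACH part IN ARRAY string_to_array(cond, '&') LOOP
        part := trim(part);
        IF left(part, 1) = '>' THEN
            IF weight <= substr(part, 2)::numeric THEN
                RETURN false;
            END IF;
        ELSIF left(part, 1) = '<' THEN
            IF weight >= substr(part, 2)::numeric THEN
                RETURN false;
            END IF;
        END IF;
    END LOOP;
    RETURN true;
END;
$$ LANGUAGE plpgsql IMMUTABLE;
Usage, with 30 as the input number:
SELECT price FROM table_1
WHERE type = 'A'
  AND condition_weight_is_satisfied(condition_weight, 30);  -- matches '>10 & <50', returns 500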

Calculating sum with no direct join column

I have a table (ShipJourneys) where I need to calculate the Total Fuel Consumed, which is a float value. See the image below.
This value is obtained by summing all the individual consumers of fuel for a given vessel over the specified timeframe. This data is contained in a second table.
The area boxed in red shows there were 5 fuel consumers (identified by FK_RmaDataSumsystemConfigID); 3 of the consumers had burned 0 units of fuel and 2 had each burned 29.
To calculate TotalFuelConsumed over that range of timestamps for a given vessel (identified by FK_RmaID), the following query could be used:
Select sum(FuelCalc)
from FuelCalc
where Timestamp >= '2019-07-24 00:00:00'
and Timestamp <= '2019-07-24 00:02:00'
and FK_RmaID = 660
Using something like the query below does not work, resulting in bogus values:
UPDATE ShipJourneys
SET TotalFuelConsumed =
(Select sum(FuelCalc) from FuelCalc as f
WHERE f.timestamp >= StartTimeUTC
and f.timestamp <= EndTimeUTC
and f.FK_RmaID = FK_RmaID)
Any suggestions on how I could join them?
You could try something like this:
UPDATE ShipJourneys AS sj     -- put the correct table name here
SET TotalFuelConsumed =
    (SELECT sum(f.FuelCalc)
     FROM FuelCalc AS f
     WHERE f.Timestamp >= sj.StartTimeUTC
       AND f.Timestamp <= sj.EndTimeUTC
       AND f.FK_RmaID = sj.FK_RmaID)   -- qualify with the outer table's alias
The key is qualifying StartTimeUTC, EndTimeUTC, and FK_RmaID with the outer table's alias; unqualified, FK_RmaID = FK_RmaID resolves entirely inside the subquery and is always true, so every journey sums fuel across all vessels.
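Before running the UPDATE, it can help to dry-run the correlation as a plain SELECT (a sketch using the column names from the question):
SELECT sj.FK_RmaID, sj.StartTimeUTC, sj.EndTimeUTC,
       (SELECT sum(f.FuelCalc)
        FROM FuelCalc AS f
        WHERE f.Timestamp >= sj.StartTimeUTC
          AND f.Timestamp <= sj.EndTimeUTC
          AND f.FK_RmaID = sj.FK_RmaID) AS TotalFuelConsumed
FROM ShipJourneys AS sj;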

Restructuring/Optimizing: Avoiding table scan and Select statement is slower in plsql function

I have a PL/pgSQL function to check if a point is in a polygon. To start, I want to do an AABB test on the min/max latitudes and longitudes so I don't have to do the raycast every time. I'm doing the following inside the function in order to grab the minimums and maximums.
My problem is that each select max() / select min() statement takes about 500 ms to execute inside the function. If I run the same statements outside the function, each query takes about 20 ms. Why are they so slow inside the function?
select max(latitude) into maxLat from points where location=name_input;
select max(longitude) into maxLong from points where location=name_input;
select min(latitude) into minLat from points where location=name_input;
select min(longitude) into minLong from points where location=name_input;
Here's the complete function. As you can guess from the code, I know very little SQL, and I'm writing this for both PostgreSQL and Oracle (so some parts might just be a bad port, like having two arrays for lat/long instead of one array of points, which is what I did in Oracle). I know that my call is really slow, and the plan shows a table scan even though I created indexes on the function and on the columns it uses. I was told in another question that it's impossible to index on my function because I pass in a string as a variable, so I'm trying to figure out how to fix that.
CREATE OR REPLACE FUNCTION GEOLOCATION_CONTAINS
(
name_input IN VARCHAR, --Name of the geofilter
lat_in IN DOUBLE PRECISION, --latitude of the point to test
long_in IN DOUBLE PRECISION --longitude of the point to test
)
RETURNS INTEGER
AS $$
DECLARE
j int := 0; --index to previous point
inside int := 0; -- If the point is inside or not
numPoints int := 0; --Total number of points in the geo filter
pointsLAT DOUBLE PRECISION[]; --An array of latitudes
pointsLONG DOUBLE PRECISION[]; --An array of longitudes
maxLat double precision := 0.0;
maxLong double precision := 0.0;
minLat double precision := 0.0;
minLong double precision := 0.0;
BEGIN
--Populate the array of points by grabbing all the points in a filter
--The convention seems to be that order of a geo filter's points is defined by the order of their IDs, increasing
pointsLAT := array(SELECT latitude FROM points where location=name_input ORDER BY ID);
pointsLONG := array(SELECT longitude FROM points where location=name_input ORDER BY ID);
--Get the max/min lat/long to return before raycasting
select max(latitude) into maxLat from points where location=name_input;
select max(longitude) into maxLong from points where location=name_input;
select min(latitude) into minLat from points where location=name_input;
select min(longitude) into minLong from points where location=name_input;
--Check if it's even possible to be in the filter. If it's outside the bounds, return 0 for outside.
IF lat_in <= minLat OR lat_in >= maxLat OR long_in <= minLong OR long_in >= maxLong THEN
return 0;
END IF;
--Get the total number of points in the points array
SELECT COUNT(*) into numPoints from points where location=name_input;
--Init the pointer to the prev point index to the last guy in the array
j := numPoints;
--Perform raycast intersection test over the polygon
for i IN 1..numPoints loop
--Test for intersection on an edge of the polygon
if((pointsLAT[i]>lat_in) != (pointsLAT[j]>lat_in)) then
if (long_in < (pointsLONG[j]-pointsLONG[i]) * (lat_in-pointsLAT[i]) / (pointsLAT[j]-pointsLAT[i]) + pointsLONG[i]) then
--Intersected a line, toggle in/out
if(inside = 0) then
inside := 1;
else
inside := 0;
end if;
end if;
end if;
--set J to previous before incrementing i
j := i;
end loop;
RETURN inside;
END; $$ LANGUAGE plpgsql IMMUTABLE;
I'm looking for a way to get a functional index to work, because the function is just too slow when I run it against a table with 200,000+ rows (about 40 seconds now, with the optimizations provided so far in the answers). To compare, doing a select * of all the objects and running them through Java's Polygon class takes 2 seconds, so obviously I'm doing something wrong in my PL/pgSQL implementation. I'm currently reading tutorials and I see things like inlined functions and views mentioned as ways to speed things up, but I'm not exactly sure what to read up on to make this faster.
Why have four statements?
select max(latitude), max(longitude), min(latitude), min(longitude)
into maxLat, maxlong, minLat, minLong
from points
where location = name_input;
This doesn't address why the call is faster outside the function than inside, but there is other overhead to calling a function.
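One plausible culprit (an assumption; the question doesn't show the plans) is that PL/pgSQL executes embedded queries as prepared statements, so the plan is built without knowing the actual value of name_input and may fall back to a sequential scan. You can approximate what the function sees with an explicit prepared statement; 'some_location' is a made-up value:
PREPARE probe(varchar) AS
    SELECT max(latitude) FROM points WHERE location = $1;
EXPLAIN EXECUTE probe('some_location');  -- compare against EXPLAIN of the same query with a literal
Older PostgreSQL versions always planned prepared statements generically; newer ones may choose between generic and custom plans.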
You can reduce all seven SQL statements to a single one:
select max(latitude),
       max(longitude),
       min(latitude),
       min(longitude),
       array_agg(latitude ORDER BY ID),
       array_agg(longitude ORDER BY ID),
       count(*)
into maxLat, maxLong, minLat, minLong, pointsLAT, pointsLONG, numPoints
from points
where location = name_input;
I have no experience with GIS processing so I might be completely wrong with the following:
It seems that you are storing the polygon as multiple rows in the table. However, in Postgres, and even more so with the PostGIS extension, you can store a polygon in a single column, and there are native operators that check whether a point is inside a polygon. Queries using those operators can make use of GiST or GIN indexes.
My understanding is that for any serious GIS work you should definitely look into PostGIS. The built-in geometric data types in Postgres only offer a very basic set of features.
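For illustration, a minimal sketch of that approach; it assumes PostGIS is installed, and the table name, column names, and SRID are made up:
-- One geometry column holds the whole polygon
CREATE TABLE geofences (
    name text PRIMARY KEY,
    area geometry(Polygon, 4326)
);
CREATE INDEX geofences_area_idx ON geofences USING GIST (area);
-- The point-in-polygon test becomes a single, index-assisted query
SELECT ST_Contains(area, ST_SetSRID(ST_MakePoint(long_in, lat_in), 4326))
FROM geofences
WHERE name = 'my_filter';
Note that ST_MakePoint takes x (longitude) first, then y (latitude).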

Generate a random number of non-duplicated random numbers in [1, 1001] through a loop

I need to generate a random number of non-duplicated random numbers in PL/pgSQL. The numbers shall fall in the range [1, 1001]. However, the code below generates numbers exceeding 1001.
directed2number := trunc(Random()*7+1);
counter := directed2number
while counter > 0
loop
to_point := trunc((random() * 1/directed2number - counter/directed2number + 1) * 1001 +1);
...
...
counter := counter - 1;
end loop;
If I understand right:
- You need a random number (1 to 8) of random numbers.
- The random numbers span 1 to 1001.
- The random numbers need to be unique; none shall appear more than once.
CREATE OR REPLACE FUNCTION x.unique_rand_1001()
RETURNS SETOF integer AS
$body$
DECLARE
nrnr int := trunc(random()*7+1); -- number of numbers
BEGIN
RETURN QUERY
SELECT (1000 * random())::integer + 1
FROM generate_series(1, nrnr*2)
GROUP BY 1
LIMIT nrnr;
END;
$body$ LANGUAGE plpgsql VOLATILE;
Call:
SELECT x.unique_rand_1001();
Numbers are made unique by the GROUP BY. I generate twice as many numbers as needed to provide enough numbers in case duplicates are removed. With the given dimensions of the task (max. 8 out of 1001 numbers) it is astronomically unlikely that not enough numbers remain. Worst-case scenario: fewer numbers are returned.
I wouldn't approach the problem that way in PostgreSQL.
From a software engineering point of view, I think I'd separate generating a random integer between x and y, generating 'n' of those integers, and guaranteeing the result is a set.
-- Returns a random integer in the interval [n, m].
-- Not rigorously tested. For rigorous testing, see Knuth, TAOCP vol 2.
CREATE OR REPLACE FUNCTION random_integer(integer, integer)
RETURNS integer AS
$BODY$
select cast(floor(random()*($2 - $1 +1)) + $1 as integer);
$BODY$
LANGUAGE sql VOLATILE;
Then to select a single random integer between 1 and 1000,
select random_integer(1, 1000);
To select 100 random integers between 1 and 1000,
select random_integer(1, 1000)
from generate_series(1,100);
You can guarantee uniqueness in either application code or in the database. Ruby implements a Set class. Other languages have similar capabilities under various names.
One way to do this in the database uses a local temporary table. Erwin's right about the need to generate more integers than you need, to compensate for the removal of duplicates. This code generates 20, and selects the first 8 rows in the order they were inserted.
create local temp table unique_integers (
id serial primary key,
n integer unique
);
insert into unique_integers (n)
select random_integer(1, 1001) n
from generate_series(1, 20)
on conflict (n) do nothing;
select n
from unique_integers
order by id
fetch first 8 rows only;
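For a range this small there is also a way to guarantee both uniqueness and the exact count without oversampling: shuffle the whole range and keep the first rows. A sketch, taking 8 numbers:
select n
from generate_series(1, 1001) as n
order by random()
limit 8;
This materializes all 1001 candidates, which is cheap here but would not be advisable for very large ranges.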

Unexpected SQL results: string vs. direct SQL

Working SQL
The following code works as expected, returning two columns of data (a row number and a valid value):
sql_amounts := '
SELECT
row_number() OVER (ORDER BY taken)::integer,
avg( amount )::double precision
FROM
x_function( '|| id || ', 25 ) ca,
x_table m
WHERE
m.category_id = 1 AND
m.location_id = ca.id AND
extract( month from m.taken ) = 1 AND
extract( day from m.taken ) = 1
GROUP BY
m.taken
ORDER BY
m.taken';
FOR r, amount IN EXECUTE sql_amounts LOOP
SELECT array_append( v_row, r::integer ) INTO v_row;
SELECT array_append( v_amount, amount::double precision ) INTO v_amount;
END LOOP;
Non-Working SQL
The following code does not work as expected; the first column is a row number, the second column is NULL.
FOR r, amount IN
SELECT
row_number() OVER (ORDER BY taken)::integer,
avg( amount )::double precision
FROM
x_function( id, 25 ) ca,
x_table m
WHERE
m.category_id = 1 AND
m.location_id = ca.id AND
extract( month from m.taken ) = 1 AND
extract( day from m.taken ) = 1
GROUP BY
m.taken
ORDER BY
m.taken
LOOP
SELECT array_append( v_row, r::integer ) INTO v_row;
SELECT array_append( v_amount, amount::double precision ) INTO v_amount;
END LOOP;
Question
Why does the non-working code return a NULL value for the second column when the query itself returns two valid columns? (This question is mostly academic; if there is a way to express the query without resorting to wrapping it in a text string, that would be great to know.)
Full Code
http://pastebin.com/hgV8f8gL
Software
PostgreSQL 8.4
Thank you.
The two statements aren't strictly equivalent.
Assuming id = 4, the first one gets planned/prepared on each pass, and behaves like:
prepare dyn_stmt as '... x_function( 4, 25 ) ...'; execute dyn_stmt;
The other gets planned/prepared on the first pass only, and behaves more like:
prepare stc_stmt as '... x_function( $1, 25 ) ...'; execute stc_stmt(4);
(The loop will actually make it prepare a cursor for the above, but that's beside the point here.)
A number of factors can make the two yield different results.
Search-path changes made before calling the function will be ignored by the second call, in particular if they make x_table point to something different.
Constants of all kinds and calls to immutable functions are "hard-wired" in the second call's plan.
Consider this as an illustration of these side-effects:
deallocate all;
begin;
prepare good as select now();
prepare bad as select current_timestamp;
execute good; -- yields the current timestamp
execute bad; -- yields the current timestamp
commit;
execute good; -- yields the current timestamp
execute bad; -- yields the timestamp at which it was prepared
Why the two aren't returning the same results in your case would depend on the context (you only posted part of your pl/pgsql function, so it's hard to tell), but my guess is you're running into a variation of the above kind of problem.
From Tom Lane:
I think the problem is that you're assuming "amount" will refer to a table column of the query, when actually it's a local variable of the plpgsql function. The second interpretation will take precedence unless you qualify the column reference with the table's name/alias.
Note: PG 9.0 will throw an error by default when there is an ambiguity of this type.
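A sketch of the fix under that diagnosis: qualify the column with the table alias and, to be extra safe, rename the loop variable so it cannot shadow the column (v_amt is a made-up name, assumed to be declared double precision in place of amount):
FOR r, v_amt IN
    SELECT
        row_number() OVER (ORDER BY m.taken)::integer,
        avg( m.amount )::double precision  -- m.amount now unambiguously means the column
    FROM
        x_function( id, 25 ) ca,
        x_table m
    WHERE
        m.category_id = 1 AND
        m.location_id = ca.id AND
        extract( month from m.taken ) = 1 AND
        extract( day from m.taken ) = 1
    GROUP BY
        m.taken
    ORDER BY
        m.taken
LOOP
    v_row := array_append( v_row, r::integer );
    v_amount := array_append( v_amount, v_amt );
END LOOP;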