Querying in Postgres for a range around an integer

In postgresql, I'd like to determine whether a known integer is within a +/- range of another integer. What is the function for this query?
Example: In my dataset, I have 2 Tables:
Table_1
ID integer
1 2000
2 3000
3 4000
Table_2
ID integer
1 1995
2 3050
3 4100
For each ID pair, I'd like to query whether Table_1.integer is within +/- 25 of Table_2.integer.
The answers would be:
ID 1: TRUE
ID 2: FALSE
ID 3: FALSE
Any help is much appreciated. I am new to using postgresql and all programming languages in general.

We can try checking the absolute value of the difference between the two integer values, for each ID:
SELECT
t1.ID,
CASE WHEN ABS(t1.integer - t2.integer) <= 25 THEN 'TRUE' ELSE 'FALSE' END AS answer
FROM Table_1 t1
INNER JOIN Table_2 t2
ON t1.ID = t2.ID
ORDER BY
t1.ID;
If you want to just output the raw boolean value, then use:
SELECT
t1.ID,
ABS(t1.integer - t2.integer) <= 25 AS answer
FROM ...
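
To check the ABS approach against the sample data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption, made only so the example runs on its own), and it renames the data column to val, which is a hypothetical name chosen to avoid calling a column integer. SQLite prints booleans as 1 and 0. The BETWEEN column is an equivalent way of writing the same test, not something taken from the answers here.

```python
import sqlite3

# In-memory SQLite database; a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER PRIMARY KEY, val INTEGER);
    CREATE TABLE table_2 (id INTEGER PRIMARY KEY, val INTEGER);
    INSERT INTO table_1 VALUES (1, 2000), (2, 3000), (3, 4000);
    INSERT INTO table_2 VALUES (1, 1995), (2, 3050), (3, 4100);
""")

# "val" is a hypothetical column name standing in for the question's "integer".
range_rows = conn.execute("""
    SELECT t1.id,
           ABS(t1.val - t2.val) <= 25 AS answer,
           t1.val BETWEEN t2.val - 25 AND t2.val + 25 AS answer_between
    FROM table_1 t1
    INNER JOIN table_2 t2 ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

for row in range_rows:
    print(row)  # (id, answer, answer_between), with 1 = true and 0 = false
```

Only ID 1 comes back as true, which matches the answers expected in the question.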

This is similar to @Tim's solution but without the CASE expression, which is useful if you want to output a boolean type.
SELECT t1.ID,ABS(t1.integer - t2.integer) <= 25 as res
FROM table_1 t1 JOIN table_2 t2
ON t1.ID = t2.ID;

Related

Express Nearest Neighbor Join in Postgresql?

I have two tables Q and T, both containing a column of float numbers.
What I want to do is, for each number in Q, I want to find a number in T that has the smallest distance to it.
For example, for T={1,7,9} and Q={2,6,10}, I want to return Q,T pairs as {(2,1),(6,7),(10,9)}.
How should I express this query with SQL?
In addition, is it possible to accelerate this join with an index, e.g. by adding an operator class that binds "FOR ORDER BY <->" to an fabs calculation?
create table t (val_t integer);
create table q (val_q integer);
insert into t values (1),(7),(9);
insert into q values (2),(6),(10);
Start with a query that cross joins the two tables and adds a rank based on the difference:
SELECT val_q, val_t, rank() OVER (PARTITION BY val_q ORDER BY abs(val_t - val_q))
FROM t
JOIN q ON true ;
Use this query in a CTE or subquery and filter by rank:
WITH src AS(
SELECT val_q, val_t, rank() OVER (PARTITION BY val_q ORDER BY abs(val_t - val_q))
FROM t
JOIN q ON true )
SELECT val_q, val_t FROM src
WHERE rank = 1;
val_q | val_t
-------+-------
2 | 1
6 | 7
10 | 9
See https://www.postgresql.org/docs/12/tutorial-window.html
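
For anyone who wants to try the rank-based query without a PostgreSQL server, the sketch below runs it through Python's built-in sqlite3 module. SQLite is used here only as a stand-in (an assumption), and it needs a version with window-function support. Two small changes from the query above: the rank column gets an explicit alias, rnk (a hypothetical name), because SQLite does not name the column rank automatically, and the join is written as CROSS JOIN instead of JOIN ... ON true.

```python
import sqlite3

# In-memory SQLite database; a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (val_t INTEGER);
    CREATE TABLE q (val_q INTEGER);
    INSERT INTO t VALUES (1), (7), (9);
    INSERT INTO q VALUES (2), (6), (10);
""")

# Rank every (q, t) pair by distance within each val_q, then keep rank 1.
nearest = conn.execute("""
    WITH src AS (
        SELECT val_q, val_t,
               rank() OVER (PARTITION BY val_q
                            ORDER BY abs(val_t - val_q)) AS rnk
        FROM t CROSS JOIN q
    )
    SELECT val_q, val_t
    FROM src
    WHERE rnk = 1
    ORDER BY val_q
""").fetchall()

print(nearest)
```

Note that rank() gives tied rows the same rank, so a value in q that is equally close to two values in t would return both pairs. Swapping in row_number() would return exactly one row per val_q, at the cost of picking the tie winner arbitrarily.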
Given this schema:
create table t (tn float);
insert into t values (1), (7), (9);
create table q (qn float);
insert into q values (2), (6), (10);
DISTINCT ON is the most straightforward way:
select distinct on (qn) qn, tn
from q
cross join t
order by qn, abs(qn - tn);
Exploiting a numeric range may perform better depending on your data sizes. If performance is an issue, then you can create an actual temp table for the range_tn CTE and put a gist index on it:
with all_tn as (
select tn
from t
union select null
), range_tn as (
select numrange(tn::numeric, (lead(tn) over w)::numeric, '[]') as tr
from all_tn
window w as (order by tn nulls first)
)
select qn,
case
when lower_inf(tr) then upper(tr)
when upper_inf(tr) then lower(tr)
when 2 * qn - lower(tr) - upper(tr) > 0 then upper(tr)
else lower(tr)
end as tn
from q
join range_tn
on qn::numeric <@ tr;

Return the most recent value when joining two tables

I am trying to join two tables and return the most recent value for a field.
Currently, if aa.day_time does not equal bb.time, then the bb.time field returns null. I would like this to return the most recent value less than or equal to the aa.day_time value.
My query currently looks like this:
Select
aa.day_time,
aa.name,
aa.value,
bb.name,
bb.target_value,
bb.time
FROM
x.table1 aa LEFT JOIN y.table2 bb
ON aa.name = bb.name AND aa.day_time = bb.time
WHERE aa.day_time = TO_DATE('01/01/2017','DD/MM/YYYY')
Searching Stack Overflow and other websites turned up a number of solutions, but none of them worked. The query below is the closest I got: it did not throw an error message, but it ran for several hours and I had to stop it. The query above ran in about 5 seconds.
Select
aa.day_time,
aa.name,
aa.value,
bb.name,
bb.target_value,
bb.time
FROM
x.table1 aa LEFT JOIN y.table2 bb
ON aa.name = bb.name AND aa.day_time =
(SELECT MAX (bb.time)
FROM y.table2
WHERE bb.time <= aa.day_time)
WHERE aa.day_time = TO_DATE('01/01/2017','DD/MM/YYYY')
I'm not familiar with SQL, so thank you very much for your help in advance.
If this is Oracle and there is a one-to-many relationship from t1 to t2, then using a CTE to find the most recent date from t2 might do it:
DROP TABLE T1;
DROP TABLE T2;
CREATE TABLE T1(DAY_TIME DATE,NAME VARCHAR(3), VALUE NUMBER);
CREATE TABLE T2(DAY_TIME DATE,NAME VARCHAR(3), VALUE NUMBER);
TRUNCATE TABLE T1;
INSERT INTO T1(day_time,NAME,VALUE) VALUES (to_date('2018-01-01','YYYY-MM-DD'),'aaa',10);
TRUNCATE TABLE T2;
INSERT INTO T2(day_time,NAME,VALUE) VALUES (to_date('2017-01-01','YYYY-MM-DD'),'aaa',10);
INSERT INTO T2(day_time,NAME,VALUE) VALUES (to_date('2018-02-01','YYYY-MM-DD'),'aaa',10);
SELECT * FROM T1;
SELECT * FROM T2;
WITH cte AS
(
select name,day_time,value
from T2
where T2.day_time = (select MAX(t3.DAY_TIME) FROM T2 t3 WHERE t3.DAY_TIME <= TO_DATE('2018-01-01','YYYY-MM-DD') and t3.name = t2.name)
)
SELECT t1.name,t1.day_time,t1.value,
cte.name,cte.day_time,cte.value
from t1
left join cte on t1.name = cte.name
where t1.day_time = to_date('2018-01-01','YYYY-MM-DD');
NAME DAY_TIME VALUE NAME DAY_TIME VALUE
---- ---------------------- ---------- ---- ---------------------- ----------
aaa 01-JAN-2018 00:00:00 10 aaa 01-JAN-2017 00:00:00 10
If there are no entries at all in t2, then the t2 side of the select will be empty.
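
The CTE above hard-codes the cutoff date. A variation that takes the cutoff from each t1 row can be sketched as follows. This is a minimal example under stated assumptions, not the answer's own query: it runs on Python's built-in sqlite3 module instead of Oracle, stores dates as ISO-8601 text so that string comparison orders them correctly, and reuses the sample rows from above.

```python
import sqlite3

# In-memory SQLite database; a stand-in for Oracle (assumption).
# Dates are ISO-8601 strings, so <= and MAX() compare them chronologically.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (day_time TEXT, name TEXT, value INTEGER);
    CREATE TABLE t2 (day_time TEXT, name TEXT, value INTEGER);
    INSERT INTO t1 VALUES ('2018-01-01', 'aaa', 10);
    INSERT INTO t2 VALUES ('2017-01-01', 'aaa', 10),
                          ('2018-02-01', 'aaa', 10);
""")

# For each t1 row, join the t2 row with the same name whose day_time is the
# latest one that is not after t1.day_time.
latest = conn.execute("""
    SELECT t1.name, t1.day_time, t1.value, t2.day_time, t2.value
    FROM t1
    LEFT JOIN t2
           ON t2.name = t1.name
          AND t2.day_time = (SELECT MAX(t3.day_time)
                             FROM t2 AS t3
                             WHERE t3.name = t1.name
                               AND t3.day_time <= t1.day_time)
    WHERE t1.day_time = '2018-01-01'
""").fetchall()

print(latest)
```

The important detail is that the subquery filters on both the name and the date. Without the name condition it would look for the latest date across all names, which is likely to be slower and can pick a date that does not exist for the name being joined.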

How to write a SQL query which selects rows where a column value changed from the previous row

CREATE TABLE data( id serial NOT NULL,
val integer,
plan smallint,
time timestamp without time zone,
CONSTRAINT data_pkey PRIMARY KEY (id))
WITH (OIDS=FALSE);
ALTER TABLE data
OWNER TO postgres;
-- Index: data_idx
CREATE INDEX data_idx
ON data
USING btree
(time, id);
I have a table like this
id val plan time
1 8300 1 2011-01-01
2 8300 1 2011-01-02
3 8300 2 2011-01-03
4 9600 1 2011-01-04
5 9600 2 2011-01-05
How do I select the rows where sigplan (the plan column above) changed from the previous row for that siteId (the val column above)?
In the example above, the query should return the rows
2011-01-03 (sigplan changed from 1 to 2 between 2011-01-01 and 2011-01-03 for 8300),
2011-01-05(sigplan changed from 1 to 2 between 2011-01-04 and 2011-01-05 for 9600).
The table contains a lot of data, so the query should be optimized.
SELECT siteId, sigplan, MAX(server_time) FROM traffview.status_data
GROUP BY siteId, sigplan
HAVING COUNT(1) > 1 AND MAX(server_time) > 'XXXXX' AND MAX(server_time) < 'XXXXX'
The annoying part is figuring out which is the previous row id with the same siteId. After that it is pretty easy by joining the table with itself.
SELECT t1.* FROM table t1, table t2
WHERE t1.sigplan != t2.sigplan
AND t2.id = (SELECT MAX(t3.id) FROM table t3
WHERE t3.id < t1.id AND t3.siteId = t1.siteId)
If the table is moderately (not extremely) large I would consider doing this in application code instead, or by storing the change flag in its own column when writing a new row. A subquery for each row in the table has very poor performance.
This version doesn't have a sub-query, but does assume that you have consecutive IDs.
SELECT t1.*
FROM traffview AS t1, traffview AS t2
WHERE
t1.siteId = t2.siteId
AND t1.sigplan <> t2.sigplan
AND t1.id - t2.id = 1
ORDER BY
t1.server_time
When you need to compare a row with the previous row, the LAG window function does the job for you:
SELECT sub.*
FROM (
SELECT
plan AS curr_plan,
LAG(plan) OVER (PARTITION BY val ORDER BY time) AS prev_plan,
val,
time
FROM data -- table name assumed from the ALTER TABLE / CREATE INDEX statements in the question
) sub
WHERE
sub.prev_plan IS NOT NULL AND sub.prev_plan <> sub.curr_plan;
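
Here is a minimal, runnable sketch of the LAG approach on the five sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption; it needs an SQLite build with window functions), and it borrows the names status_data, siteId, sigplan and server_time from the question's own query so that no reserved-looking column names are involved.

```python
import sqlite3

# In-memory SQLite database; a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE status_data (id INTEGER PRIMARY KEY, siteId INTEGER,
                              sigplan INTEGER, server_time TEXT);
    INSERT INTO status_data VALUES
        (1, 8300, 1, '2011-01-01'),
        (2, 8300, 1, '2011-01-02'),
        (3, 8300, 2, '2011-01-03'),
        (4, 9600, 1, '2011-01-04'),
        (5, 9600, 2, '2011-01-05');
""")

# LAG looks one row back within the same siteId, ordered by time.
# The first row of each site has no predecessor, so its prev_plan is NULL.
changed = conn.execute("""
    SELECT siteId, server_time, prev_plan, curr_plan
    FROM (
        SELECT siteId, server_time,
               sigplan AS curr_plan,
               LAG(sigplan) OVER (PARTITION BY siteId
                                  ORDER BY server_time) AS prev_plan
        FROM status_data
    ) sub
    WHERE prev_plan IS NOT NULL AND prev_plan <> curr_plan
    ORDER BY server_time
""").fetchall()

for row in changed:
    print(row)
```

This returns the 2011-01-03 row for site 8300 and the 2011-01-05 row for site 9600, the two rows the question asks for. Unlike the self-join on t1.id - t2.id = 1, it does not depend on the IDs being consecutive.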

Postgresql running sum of previous groups?

Given the following data:
sequence | amount
1 100000
1 20000
2 10000
2 10000
I'd like to write a sql query that gives me the sum of the current sequence, plus the sum of the previous sequence. Like so:
sequence | current | previous
1 120000 0
2 20000 120000
I know the solution likely involves windowing functions but I'm not too sure how to implement it without subqueries.
select
seq,
amount,
lag(amount::int, 1, 0) over(order by seq) as previous
from (
select seq, sum(amount) as amount
from sa
group by seq
) s
order by seq
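
The lag-based query can be tried on the question's four rows with the sketch below. It runs on Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption), keeps the table and column names sa and seq used in the query above, and drops the ::int cast because SQLite does not have that syntax. The output aliases current_total and previous_total are hypothetical names.

```python
import sqlite3

# In-memory SQLite database; a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sa (seq INTEGER, amount INTEGER);
    INSERT INTO sa VALUES (1, 100000), (1, 20000), (2, 10000), (2, 10000);
""")

# Sum per sequence first, then use LAG to pull in the previous group's sum.
# The third argument of LAG (0) is the default when there is no previous row.
totals = conn.execute("""
    SELECT seq,
           amount AS current_total,
           LAG(amount, 1, 0) OVER (ORDER BY seq) AS previous_total
    FROM (
        SELECT seq, SUM(amount) AS amount
        FROM sa
        GROUP BY seq
    ) s
    ORDER BY seq
""").fetchall()

print(totals)
```

Because LAG works on the previous row in seq order, this approach also copes with gaps in the sequence numbers.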
If your sequence is sequential, without holes, you can simply do:
SELECT t1.sequence,
SUM(t1.amount),
(SELECT SUM(t2.amount) from mytable t2 WHERE t2.sequence = t1.sequence - 1)
FROM mytable t1
GROUP BY t1.sequence
ORDER BY t1.sequence
Otherwise, instead of t2.sequence = t1.sequence - 1 you could do:
SELECT t1.sequence,
SUM(t1.amount),
(SELECT SUM(t2.amount)
from mytable t2
WHERE t2.sequence = (SELECT MAX(t3.sequence)
FROM mytable t3
WHERE t3.sequence < t1.sequence))
FROM mytable t1
GROUP BY t1.sequence
ORDER BY t1.sequence;

How to replace a column values when querying with left join command SQL Server 2008 r2

Let me explain my question with an example.
We have a table that contains the columns:
Id
Name
Number
Example data:
1 House 4
2 Hospital 3
3 Airport 'null'
4 Station 2
select t1.id,
t1.name,
t2.name as name2
from your_table t1
left join your_table t2 on t1.number = t2.id
When I run the query above, the row whose Number column contains 'null' causes an error. I want to modify the query so that it returns name2 as null for those rows instead of raising an error.
So the result I expect should be:
1 House Station
2 Hospital Airport
3 Airport null
4 Station Hospital
The null here is stored as a string.
The current error I get:
Msg 245, Level 16, State 1, Line 5
Conversion failed when converting the varchar value 'null' to data type smallint.
thank you
You should fix your database design. In the meantime, use NULLIF to get your expected results:
select t1.id,
t1.name,
t2.name as name2
from your_table t1
left join your_table t2 on NULLIF( t1.number, 'NULL' ) = t2.id
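
To see what NULLIF does to the join, here is a small sketch on the question's four rows. It runs on Python's built-in sqlite3 module, not on SQL Server (an assumption made so the example is self-contained). SQLite will not reproduce the Msg 245 conversion error, so this only illustrates the effect of NULLIF: the string 'null' becomes a real NULL, which matches no id, and the LEFT JOIN then leaves name2 empty. The explicit CAST is added for SQLite and is not part of the answer's query.

```python
import sqlite3

# In-memory SQLite database; a stand-in for SQL Server (assumption).
# number is stored as text, as in the question, with the literal string 'null'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER PRIMARY KEY, name TEXT, number TEXT);
    INSERT INTO your_table VALUES
        (1, 'House',    '4'),
        (2, 'Hospital', '3'),
        (3, 'Airport',  'null'),
        (4, 'Station',  '2');
""")

# NULLIF(x, 'null') returns NULL when x equals 'null', otherwise x unchanged.
names = conn.execute("""
    SELECT t1.id, t1.name, t2.name AS name2
    FROM your_table t1
    LEFT JOIN your_table t2
           ON CAST(NULLIF(t1.number, 'null') AS INTEGER) = t2.id
    ORDER BY t1.id
""").fetchall()

for row in names:
    print(row)
```

The Airport row comes back with name2 set to None (SQL NULL), and the other three rows pick up Station, Airport and Hospital, as in the expected result.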