Postgres - Insert nearest neighbour distance into another table - postgresql

So I have three tables (A, B, C). In tables A and B I have points, and I want to insert into C each row from A, and some columns from the closest point from B to each point in A, as well as the distance between them. I know that the query to get the nearest neighbour is this:
SELECT DISTINCT ON (A.id5) A.state, B.way, st_distance (A.geom,B.geom) INTO C
FROM A, B
WHERE ST_DWithin(A.geom, B.geom, 150)
ORDER BY A.id5, ST_Distance(A.geom,B.geom)
But I need to get that into a bigger INSERT query, and I tried to do it this way:
INSERT INTO complete(id_door, distance, id_way,Y, X, geom, check)
(SELECT A.state, (select distinct on (A.id5) ST_DISTANCE(A.geom,B.geom) from A order by A.id5, st_distance(A.geom,B.geom)), b.way, ST_Y(B.geom), ST_X(B.geom) ,B.geom, V.check
FROM A, B, C, V
WHERE
ST_INTERSECTS(A.geom, V.geom)
AND ST_DWithin(A.geom, B.geom,150))
But this is not the right way, because I get the error:
psycopg2.ProgrammingError: more than one row returned by a subquery used as an expression
I cannot copy all the distances from A and B to C and then delete all but the closest because it is a huge table and I would run out of memory, so I need a way to only insert the rows with the info from the closest point from B to A.
What am I doing wrong here? Thank you in advance
UPDATE:
After some help, I have learned that I should use a LATERAL in the SELECT query, but I'm not sure how to use it.
I need the SELECT to take each row in table A and find its nearest neighbour in table B, which I guess is done using the query stated previously, then insert into table C some columns from A, some columns from its nearest neighbour (table B), and some columns from table V, which is selected by an intersect condition. The main problem is how to organize all that into the SELECT so I don't get an error.
This is where I am at this point:
INSERT INTO C (id_door, distance, id_way,Y, X, geom, check)
(SELECT A.state, l.*, V.check
FROM A, B, C, V
lateral (select st_distance(a.geom,b.geom), b.way, ST_Y(B.geom), ST_X(B.geom) ,B.geom
From B
Where ST_DWithin(a.geom, b.geom,150))
Order by a.geom<->b.geom limit 1) l
WHERE
ST_INTERSECTS(A.geom, V.geom)

You can use a lateral join: a subquery in the FROM clause that can reference tables listed before it in the same FROM clause. The PostgreSQL documentation describes it under LATERAL subqueries.
-- Edited according to the new information in the question --
INSERT INTO C (id_door, distance, id_way, Y, X, geom, "check")  -- check is a reserved word in PostgreSQL, so it is quoted here
SELECT l.*
FROM A,
LATERAL (
    -- for each row of A: the nearest B within 150 units, plus the V that A intersects
    SELECT A.state, ST_Distance(A.geom, B.geom),
           B.way, ST_Y(B.geom), ST_X(B.geom), B.geom,
           V.check
    FROM B, V
    WHERE ST_DWithin(A.geom, B.geom, 150)
      AND ST_DWithin(A.geom, V.geom, 0)
      AND ST_Intersects(A.geom, V.geom)
    ORDER BY A.geom <-> B.geom, V.geom
    LIMIT 1
) l;
If you want more than one record for each point in A, increase the limit from 1 to the desired value.

Related

Updating point by nearest neighbor 3Ddistance

A follow on from this question: ST_3DClosestPoint returning multiple points
1) I have an xyz target point stored as a geom, and I want to update its row with the 3D distance to the nearest obs point in another table. So: find the nearest obs point and then update the target point with the distance.
expected result for the target point:
id|geom|3Ddistance_To_Nearest_Point_In_Obs_Table
obs table:
id|geom
e.g. 100 records
2) To complicate matters, I also want to select the n nearest neighbours from the obs table (let's say 10, for example), calculate the average distance, and update the target table.
expected target result:
id|geom|average_3Ddistance
I've been trying to alter the former example, but no joy, any ideas?
Thanks
If the tables are static, you can use CTAS (CREATE TABLE AS SELECT) to build the results instead of updating in place.
create table new_table as
select t.id, t.geom, min(a.dist_3d) as min_3ddistance, avg(a.dist_3d) as avg_3ddistance
from target t,
lateral (select st_3ddistance(o.geom, t.geom) as dist_3d  -- an unquoted identifier cannot start with a digit
         from obs o
         order by dist_3d
         limit 10) a  -- the 10 nearest obs points for this target row
group by t.id, t.geom;
Or, if you want to update the existing table:
update target t
set average_3ddistance = b.avg_3ddistance,
    min_3ddistance = b.min_3ddistance
from (select t2.id, min(a.dist_3d) as min_3ddistance, avg(a.dist_3d) as avg_3ddistance
      from target t2,
      lateral (select st_3ddistance(o.geom, t2.geom) as dist_3d
               from obs o
               order by dist_3d
               limit 10) a  -- the 10 nearest obs points for this target row
      group by t2.id) b
where b.id = t.id;
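To make explicit what the per-target subquery computes, here is a small sketch in plain Python rather than SQL. It assumes points are simple (x, y, z) tuples; the function name nearest_stats and the sample data are hypothetical. For one target it takes the k smallest 3D distances to the obs points and returns their minimum and average, which is what the LIMIT 10 plus min()/avg() combination does for each row.

```python
import heapq
import math


def nearest_stats(target, obs, k=10):
    """Return (min, average) of the 3D distances from target to its k nearest obs points."""
    # nsmallest returns the k smallest distances in ascending order.
    dists = heapq.nsmallest(k, (math.dist(target, o) for o in obs))
    if not dists:
        return None  # no observation points at all
    return dists[0], sum(dists) / len(dists)


# Illustrative data only.
target = (0.0, 0.0, 0.0)
obs = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (10.0, 10.0, 10.0)]
stats = nearest_stats(target, obs, k=3)
print(stats)
```

With k=3 the far point at (10, 10, 10) is ignored, so the average is taken over the three closest observations only.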

Product of regular field and computed field in an ORDER BY clause

The following query works fine:
SELECT a, b, c,
(SELECT COUNT(*) AS COUNT FROM table_b WHERE field_a = p.a) AS d
FROM table_a AS p
ORDER BY b DESC
And this also works:
SELECT a, b, c,
(SELECT COUNT(*) AS COUNT FROM table_b WHERE field_a = p.a) AS d
FROM table_a AS p
ORDER BY d DESC
But the following fails with ERROR: column "d" does not exist:
SELECT a, b, c,
(SELECT COUNT(*) AS COUNT FROM table_b WHERE field_a = p.a) AS d
FROM table_a AS p
ORDER BY (b * d) DESC
The only difference between the three queries above is the ORDER BY clause. In the first two queries, results are ordered by either the b field or the computed d field. In the last query, results should be ordered by the product of b and d.
How come, in the last query, PostgreSQL says that d does not exist while it can find it without issue in the second query?
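An arbitrary expression in the ORDER BY clause can only be formed from input columns, not from output column aliases such as d. From the PostgreSQL documentation:
Each expression can be the name or ordinal number of an output column (SELECT list item), or it can be an arbitrary expression formed from input-column values.
You will need to wrap the query in a subquery, so that the alias becomes an input column of the outer query: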
select *
from (select 1 as a, 2 as b) s
order by a * b
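Applied to the query from the question, the subquery rewrite could look like the sketch below. It assumes Python's built-in sqlite3 module and made-up sample rows so it can run anywhere. SQLite may be more lenient than PostgreSQL about aliases in ORDER BY expressions, so this only demonstrates the portable subquery form, not the original error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (a INTEGER, b INTEGER, c TEXT);
    CREATE TABLE table_b (field_a INTEGER);
    INSERT INTO table_a VALUES (1, 5, 'x'), (2, 3, 'y'), (3, 10, 'z');
    INSERT INTO table_b VALUES (1), (2), (2), (2);
""")

# The inner query computes d; the outer query sees d as an ordinary
# input column and can therefore use it inside an expression.
ordered = conn.execute("""
    SELECT a, b, c, d
    FROM (
        SELECT a, b, c,
               (SELECT COUNT(*) FROM table_b WHERE field_a = p.a) AS d
        FROM table_a AS p
    ) AS s
    ORDER BY b * d DESC
""").fetchall()
print(ordered)
```

Here the products b * d are 9, 5 and 0 for a = 2, 1 and 3, so the rows come back in that order.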

IN with more than one column is possible?

I have this SQL
Select A, B, D from T
Where not exists (select F, S from Z)
Can I do the same thing using NOT IN for more than one column?
Some databases (e.g. PostgreSQL) support row values, where you can write something like ROW(x, y) NOT IN (select f, s from z). Firebird unfortunately does not have row values, so you cannot use more than one column in an IN (or NOT IN).
However, you can usually emulate it with a correlated subquery in the EXISTS, e.g.:
SELECT A, B, D
FROM T
WHERE NOT EXISTS (
SELECT 1
FROM Z
WHERE Z.F = T.X AND Z.S = T.Y
)
Note that EXISTS doesn't care about the selected values, but just that one or more rows were produced, so the two columns you use in your select within the EXISTS are not relevant (it could just as well have been 1 as I used above).
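A runnable sketch of the correlated NOT EXISTS form, assuming Python's built-in sqlite3 module in place of Firebird and invented sample rows. The columns X and Y on T follow the answer's query; their types and values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (A TEXT, B TEXT, D TEXT, X INTEGER, Y INTEGER);
    CREATE TABLE Z (F INTEGER, S INTEGER);
    INSERT INTO T VALUES ('a1', 'b1', 'd1', 1, 1),
                         ('a2', 'b2', 'd2', 1, 2),
                         ('a3', 'b3', 'd3', 2, 2);
    INSERT INTO Z VALUES (1, 1), (2, 2);
""")

# Keep only rows of T whose (X, Y) pair has no matching (F, S) pair in Z.
unmatched = conn.execute("""
    SELECT A, B, D
    FROM T
    WHERE NOT EXISTS (
        SELECT 1
        FROM Z
        WHERE Z.F = T.X AND Z.S = T.Y
    )
""").fetchall()
print(unmatched)
```

Only the row with (X, Y) = (1, 2) survives, because both columns must match the same row of Z for a T row to be excluded.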

Add a new column dynamically in resultset

I have a query as below:
Existing query: select A,B,C from table1.
Table2 has columns X and Y.
The new query should return a new column (D) in the result set. The value of D will be calculated from column X.
D should be calculated as D = (C * X). To decide which row of table2 supplies X, column Y can be used in the WHERE condition. Y and A are not the same but similar.
I did not understand what you meant by "Y & A are not same but similar". I assume Y and A can be used as joining keys. If so, the answer would be:
SELECT T1.A,T1.B,T1.C,T1.C*T2.X AS D
FROM Table1 T1
JOIN Table2 T2 ON T1.A=T2.Y
I hope this helps!

Browse two tables using a cursor

In my procedure I have two tables that should hold the same data. I iterate over the first table with a cursor and compare each row with the second table, where I expect to find the same data. If, for example, table_1 has 10 rows and table2 has 12, how do I detect the rows that are missing from table_1, the table traversed by the cursor?
Thx.
Sounds very much like you'd be better off using the MINUS operator (Oracle's name for it; standard SQL, PostgreSQL and SQL Server call it EXCEPT).
SELECT a, b, c
FROM table1
MINUS
SELECT a, b, c
FROM table2
This will show you all results that exist in table1 which are not present in table2. In order to show discrepancies both ways, you could do something like this:
SELECT z.*, 'In table1, not in table2' problem_description
FROM (
SELECT a, b, c
FROM table1
MINUS
SELECT a, b, c
FROM table2
) z
UNION ALL
SELECT z.*, 'In table2, not in table1' problem_description
FROM (
SELECT a, b, c
FROM table2
MINUS
SELECT a, b, c
FROM table1
) z
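The two-way comparison can be tried locally with the sketch below. It assumes Python's built-in sqlite3 module, which spells the operator EXCEPT rather than MINUS, and uses invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b TEXT, c INTEGER);
    CREATE TABLE table2 (a INTEGER, b TEXT, c INTEGER);
    INSERT INTO table1 VALUES (1, 'x', 10), (2, 'y', 20);
    INSERT INTO table2 VALUES (2, 'y', 20), (3, 'z', 30);
""")

# Rows only in table1, then rows only in table2, each labelled with its direction.
diff = conn.execute("""
    SELECT z.*, 'In table1, not in table2' AS problem_description
    FROM (
        SELECT a, b, c FROM table1
        EXCEPT
        SELECT a, b, c FROM table2
    ) z
    UNION ALL
    SELECT z.*, 'In table2, not in table1' AS problem_description
    FROM (
        SELECT a, b, c FROM table2
        EXCEPT
        SELECT a, b, c FROM table1
    ) z
""").fetchall()

# UNION ALL gives no ordering guarantee, so sort before inspecting.
diff = sorted(diff)
print(diff)
```

The row shared by both tables does not appear, and each remaining row says which side it is missing from, so no cursor loop is needed.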