PostgreSQL check if coordinate is inside a bounding box


I have some locations I want to store in a database. Each location is defined by 4 coordinates, p1(lat,long), p2(lat,long), p3(lat,long) and p4(lat,long); the only guarantee is that they always form a rectangle.
Once a few locations are stored in the DB, I want to be able to query it with a point(lat, long) and check whether this point is inside any of the boxes in the DB.
My first question is how best to design this table so that it is easy and efficient to query later. My first guess is something like this:
| id | lat1 | lon1 | lat2 | lon2 | lat3 | lon3 | lat4 | lon4 |
--------------------------------------------------------------
But I'm not sure what the best query is to get all the locations, or the single location, that contain a given point. For example, the database contains 2 locations (rows):
| id | lat1 | lon1 | lat2 | lon2 | lat3 | lon3 | lat4 | lon4 |
--------------------------------------------------------------
| 1 | 0 | 0 | 0 | 10 | 10 | 10 | 10 | 0 |
| 2 | 50 | 50 | 50 | 60 | 60 | 60 | 60 | 50 |
If I have the point (5,5) how can I query the DB to get row 1?

You can use the least() and greatest() functions to get the min/max values of the lat and lon columns (and maybe use these to construct two points and an enclosing rectangle):
CREATE TABLE latlon
( id INTEGER NOT NULL PRIMARY KEY
, lat1 INTEGER NOT NULL , lon1 INTEGER NOT NULL
, lat2 INTEGER NOT NULL , lon2 INTEGER NOT NULL
, lat3 INTEGER NOT NULL , lon3 INTEGER NOT NULL
, lat4 INTEGER NOT NULL , lon4 INTEGER NOT NULL
);
INSERT into LATLON ( id,lat1,lon1,lat2,lon2,lat3,lon3,lat4,lon4) VALUES
( 1 ,0 ,0 ,0 ,10 ,10 ,10 ,10 ,0 ),
( 2 ,50 ,50 ,50 ,60 ,60 ,60 ,60 ,50 );
SELECT id
, LEAST(lat1,lat2,lat3,lat4) AS MINLAT
, LEAST(lon1,lon2,lon3,lon4) AS MINLON
, GREATEST(lat1,lat2,lat3,lat4) AS MAXLAT
, GREATEST(lon1,lon2,lon3,lon4) AS MAXLON
FROM latlon;
Result:
CREATE TABLE
INSERT 0 2
id | minlat | minlon | maxlat | maxlon
----+--------+--------+--------+--------
1 | 0 | 0 | 10 | 10
2 | 50 | 50 | 60 | 60
(2 rows)
Finding the (5,5) point:
SELECT * FROM (
SELECT id
, LEAST(lat1,lat2,lat3,lat4) AS minlat
, LEAST(lon1,lon2,lon3,lon4) AS minlon
, GREATEST(lat1,lat2,lat3,lat4) AS maxlat
, GREATEST(lon1,lon2,lon3,lon4) AS maxlon
FROM latlon
) rect
WHERE 5 >= rect.minlat AND 5 < rect.maxlat
AND 5 >= rect.minlon AND 5 < rect.maxlon
;
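For readers who want to see the logic outside SQL, here is a minimal Python sketch of the same idea: collapse the four corners into min/max bounds, then test the point with the same half-open comparison (min <= value < max) used in the query above. The names `bounding_box` and `find_boxes` are hypothetical, and the rows are the two sample locations from the question.

```python
def bounding_box(corners):
    """Return (minlat, minlon, maxlat, maxlon) for a list of (lat, lon) corners."""
    lats = [lat for lat, _ in corners]
    lons = [lon for _, lon in corners]
    return min(lats), min(lons), max(lats), max(lons)


def find_boxes(rows, lat, lon):
    """Return the ids of all rows whose bounding box contains (lat, lon).

    Uses min <= value < max, mirroring the WHERE clause of the SQL query,
    so a point lying exactly on the max edge is not reported.
    """
    hits = []
    for row_id, corners in rows.items():
        minlat, minlon, maxlat, maxlon = bounding_box(corners)
        if minlat <= lat < maxlat and minlon <= lon < maxlon:
            hits.append(row_id)
    return hits


# The two sample rows from the question, as {id: [p1, p2, p3, p4]}.
rows = {
    1: [(0, 0), (0, 10), (10, 10), (10, 0)],
    2: [(50, 50), (50, 60), (60, 60), (60, 50)],
}
print(find_boxes(rows, 5, 5))  # [1]
```

Whether the max edge should be included is a design choice; switch the `<` to `<=` if points on the border should count as inside.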

Question 1: Are the rectangles axis-aligned (sides parallel to the lat/long axes), like your examples?
If they are, then it's enough to consider two opposite corners, lat/long 1 and 3 for example, because you know that:
minLat = lat1
minLong = long1
maxLat = lat3
maxLong = long3
Question 2: Are lat and long 1 smaller than lat and long 3 (are they ordered)?
If yes:
// You simply need to check that:
(lat1 <= latX && latX <= lat3)
&& (long1 <= longX && longX <= long3)
If not, you can first check that lat1 <= lat3 and long1 <= long3, and swap them if needed.
Finally: if the answer to question 1 was "no", then you can use the same principle, but first you need to apply some additional math. Judging by the provided examples, though, I suppose that is not the case.
...anyway, if you are planning to handle more complex cases (such as real, not only integer, coordinates), you should probably try PostGIS.
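As an illustration of what the "additional math" for rotated rectangles could look like, here is a minimal Python sketch. It is one possible approach rather than the answerer's own method: it assumes the four corners are listed in order around the rectangle (as p1..p4 are in the question's examples) and treats lat/long as plain planar coordinates. The point is inside when it lies on the same side of all four edges. The function names are hypothetical.

```python
def cross(a, b, p):
    """Z-component of (b - a) x (p - a); its sign tells which side of edge a->b holds p."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def inside_rectangle(corners, p):
    """corners: four (lat, lon) tuples listed in order around the rectangle.

    Returns True when p is on the same side of every edge.
    Points exactly on the boundary count as inside.
    """
    signs = [cross(corners[i], corners[(i + 1) % 4], p) for i in range(4)]
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)


# Axis-aligned sample row 1 from the question.
print(inside_rectangle([(0, 0), (0, 10), (10, 10), (10, 0)], (5, 5)))   # True
# A rectangle rotated by 45 degrees (hypothetical data).
print(inside_rectangle([(0, 5), (5, 10), (10, 5), (5, 0)], (1, 1)))     # False
```

For real geographic data spanning large areas, the planar assumption breaks down, which is another reason to look at PostGIS.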

Related

How can I use PostGIS to select the average price of the closest X locations?

I would like to find the average price of gas for any given home. Here are my current tables.
home_id | geocoordinates
1 | 0101000020E61000005BB6D617097544
2 | 0101000020E61000005BB6D617097545
3 | 0101000020E61000005BB6D617097546
4 | 0101000020E61000005BB6D617097547
5 | 0101000020E61000005BB6D617097548
gas_price | geocoordinates
1 | 0101000020E61000005BB6D617097544
1 | 0101000020E61000005BB6D617097545
1 | 0101000020E61000005BB6D617097546
2 | 0101000020E61000005BB6D617097547
2 | 0101000020E61000005BB6D617097548
2 | 0101000020E61000005BB6D617097544
2 | 0101000020E61000005BB6D617097545
3 | 0101000020E61000005BB6D617097546
3 | 0101000020E61000005BB6D617097547
3 | 0101000020E61000005BB6D617097548
3 | 0101000020E61000005BB6D617097544
4 | 0101000020E61000005BB6D617097545
4 | 0101000020E61000005BB6D617097546
4 | 0101000020E61000005BB6D617097547
For each home, I would like to find the average gas price of the X closest gas_prices. Example if X=5:
home_id | average_of_closest_five_gas_prices
1 | 1.5
2 | 2.5
3 | 2.1
4 | 1.5
5 | 1.5
I figured out how to do it for one individual home_id, but I'm struggling to do it for all of them.
select avg(gas_price) from (
SELECT *
FROM gas_price
ORDER BY gas_price.geocoordinates <-> '0101000020E61000005BB6D617097544'
LIMIT 5
) as table_a

You can use a lateral join to limit the size of each group before the GROUP BY.
select home_id, avg(gas_price)
from home,
lateral (
select gas_price
from gas_price
order by gas_price.geocoordinates <-> home.geocoordinates
limit 5
) x
group by home_id;
Another option is to use a window function: partition by home_id, order by distance, and keep only the rows with row_number() <= 5.
select home_id, avg(gas_price)
from (
select row_number() over w as r, *
from home h, gas_price g
window w as (partition by home_id order by g.geocoordinates <-> h.geocoordinates)
) x
where r <= 5
group by home_id;
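Both queries follow the same recipe: for each home, sort the gas prices by distance, keep the first X, then average. A minimal Python sketch of that recipe, assuming plain (x, y) tuples instead of PostGIS geometries and a non-empty list of stations; `average_of_closest` and the sample data are hypothetical:

```python
import math


def average_of_closest(homes, stations, k=5):
    """homes: {home_id: (x, y)}; stations: list of (price, (x, y)).

    Returns {home_id: average price of the k nearest stations}.
    Assumes stations is non-empty.
    """
    result = {}
    for home_id, home_xy in homes.items():
        # Equivalent of ORDER BY distance LIMIT k, evaluated once per home.
        nearest = sorted(stations, key=lambda s: math.dist(s[1], home_xy))[:k]
        result[home_id] = sum(price for price, _ in nearest) / len(nearest)
    return result


homes = {1: (0, 0), 2: (10, 0)}
stations = [(1.0, (1, 0)), (2.0, (2, 0)), (3.0, (9, 0)), (5.0, (11, 0))]
print(average_of_closest(homes, stations, k=2))  # {1: 1.5, 2: 4.0}
```

The per-home sort is what the lateral subquery does; the window-function variant instead ranks every home/station pair first and filters afterwards.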

Returning null individual values with postgres tablefunc crosstab()

I am trying to incorporate the null values within the returned lists, such that:
batch_id |test_name |test_value
-----------------------------------
10 | pH | 4.7
10 | Temp | 154
11 | pH | 4.8
11 | Temp | 152
12 | pH | 4.5
13 | Temp | 155
14 | pH | 4.9
14 | Temp | 152
15 | Temp | 149
16 | pH | 4.7
16 | Temp | 150
would return:
batch_id | pH |Temp
---------------------------------------
10 | 4.7 | 154
11 | 4.8 | 152
12 | 4.5 | <null>
13 | <null> | 155
14 | 4.9 | 152
15 | <null> | 149
16 | 4.7 | 150
However, it currently returns this:
batch_id | pH |Temp
---------------------------------------
10 | 4.7 | 154
11 | 4.8 | 152
12 | 4.5 | <null>
13 | 155 | <null>
14 | 4.9 | 152
15 | 149 | <null>
16 | 4.7 | 150
This is an extension of a prior question, "Can the categories in the postgres tablefunc crosstab() function be integers?", which led to this current query:
SELECT *
FROM crosstab('SELECT lab_tests_results.batch_id, lab_tests.test_name, lab_tests_results.test_result::FLOAT
FROM lab_tests_results, lab_tests
WHERE lab_tests.id=lab_tests_results.lab_test AND (lab_tests.test_name LIKE ''Test Name 1'' OR lab_tests.test_name LIKE ''Test Name 2'')
ORDER BY 1,2'
) AS final_result(batch_id VARCHAR, test_name_1 FLOAT, test_name_2 FLOAT);
I also know that I am not the first to ask this question generally, but I have yet to find a solution that works for these circumstances. For example, this one - How to include null values in `tablefunc` query in postgresql? - assumes the same Batch IDs each time. I do not want to specify the Batch IDs, but rather all that are available.
This leads into the other set of solutions I've found out there, which address a null list result from specified categories. Since I'm just taking what's already there, however, this isn't an issue. It's the null individual values causing the problem and resulting in a pivot table with values shifted to the left.
Any suggestions are much appreciated!
Edit: With Klin's help, got it sorted out. Something to note is that the VALUES section must match the actual lab_tests.test_name values you're after, such that:
SELECT *
FROM crosstab(
$$
SELECT lab_tests_results.batch_id, lab_tests.test_name, lab_tests_results.test_result::FLOAT
FROM lab_tests_results, lab_tests
WHERE lab_tests.id = lab_tests_results.lab_test
AND (
lab_tests_results.lab_test = 1
OR lab_tests_results.lab_test = 2
OR lab_tests_results.lab_test = 3
OR lab_tests_results.lab_test = 4
OR lab_tests_results.lab_test = 5
OR lab_tests_results.lab_test = 50 )
ORDER BY 1 DESC, 2
$$,
$$
VALUES('Mash pH'),
('Sparge pH'),
('Final Lauter pH'),
('Wort pH'),
('Wort FAN'),
('Original Gravity'),
('Mash Temperature')
$$
) AS final_result(batch_id VARCHAR,
ph_mash FLOAT,
ph_sparge FLOAT,
ph_final_lauter FLOAT,
ph_wort FLOAT,
FAN_wort FLOAT,
original_gravity FLOAT,
mash_temperature FLOAT)
Thanks for the help!

Use the second form of the function:
crosstab(text source_sql, text category_sql) - Produces a “pivot table” with the value columns specified by a second query.
E.g.:
SELECT *
FROM crosstab(
$$
SELECT lab_tests_results.batch_id, lab_tests.test_name, lab_tests_results.test_result::FLOAT
FROM lab_tests_results, lab_tests
WHERE lab_tests.id=lab_tests_results.lab_test
AND (
lab_tests.test_name LIKE 'Test Name 1'
OR lab_tests.test_name LIKE 'Test Name 2')
ORDER BY 1,2
$$,
$$
VALUES('pH'), ('Temp')
$$
) AS final_result(batch_id VARCHAR, "pH" FLOAT, "Temp" FLOAT);
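To see why naming the categories fixes the left-shifted values, here is a minimal Python sketch of a pivot that places each value by its category name instead of by its position in the group. It only illustrates the idea and is not how tablefunc is implemented; `pivot` is a hypothetical helper, and the rows are the sample data from the question.

```python
def pivot(rows, categories):
    """rows: iterable of (row_key, category, value).

    Returns {row_key: [value or None for each category, in the given order]}.
    """
    position = {name: i for i, name in enumerate(categories)}
    table = {}
    for key, category, value in rows:
        if category not in position:
            continue  # skip categories that were not requested
        # Start each row as all-None so missing categories stay None.
        table.setdefault(key, [None] * len(categories))[position[category]] = value
    return table


rows = [
    (10, "pH", 4.7), (10, "Temp", 154),
    (11, "pH", 4.8), (11, "Temp", 152),
    (12, "pH", 4.5),
    (13, "Temp", 155),
    (14, "pH", 4.9), (14, "Temp", 152),
    (15, "Temp", 149),
    (16, "pH", 4.7), (16, "Temp", 150),
]
result = pivot(rows, ["pH", "Temp"])
print(result[12])  # [4.5, None]
print(result[13])  # [None, 155]
```

Batch 13 keeps its Temp value in the second column because the slot is chosen by name, which is what the category query provides in the two-argument form.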

How many intersects are in the table?

longitude | latitude
----------+---------
1 | 2
2 | 3
4 | 5
2 | 3
5 | 6
1 | 2
How can I find how many intersecting (repeated) points are in the table? In this case, 1,2 and 2,3.
SELECT ST_Intersects

E.g.:
select longitude, latitude, count(0) intersects from
table_name group by longitude, latitude having count(0) > 1
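The GROUP BY ... HAVING query simply counts how often each (longitude, latitude) pair occurs and keeps the pairs seen more than once. A minimal Python sketch of the same counting, using the sample rows from the question (`repeated_points` is a hypothetical name):

```python
from collections import Counter


def repeated_points(points):
    """Return {point: count} for every point that occurs more than once."""
    counts = Counter(points)
    return {point: n for point, n in counts.items() if n > 1}


points = [(1, 2), (2, 3), (4, 5), (2, 3), (5, 6), (1, 2)]
print(repeated_points(points))       # {(1, 2): 2, (2, 3): 2}
print(len(repeated_points(points)))  # 2
```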

How would you read a csv in a stored procedure such that the csv needs data extraction?

The csv has urls of images in the format -
www.domain.com/table_id/x_y_height_width.jpg
We want to extract table_id, x, y, height and width from these urls in a stored procedure and then use these parameters in multiple sql queries.
How can we do that?

Use the regexp_split_to_array and split_part functions:
create or replace function split_url (
_url text, out table_id int, out x int, out y int, out height int, out width int
) as $$
select
a[2]::int,
split_part(a[3], '_', 1)::int,
split_part(a[3], '_', 2)::int,
split_part(a[3], '_', 3)::int,
split_part(split_part(a[3], '_', 4), '.', 1)::int
from (values
(regexp_split_to_array(_url, '/'))
) rsa(a);
$$ language sql immutable;
select *
from split_url('www.domain.com/234/34_12_400_300.jpg');
table_id | x | y | height | width
----------+----+----+--------+-------
234 | 34 | 12 | 400 | 300
To use the function with other tables, use a lateral join:
with t (url) as ( values
('www.domain.com/234/34_12_400_300.jpg'),
('www.examplo.com/984/12_90_250_360.jpg')
)
select *
from
t
cross join lateral
split_url(url)
;
url | table_id | x | y | height | width
---------------------------------------+----------+----+----+--------+-------
www.domain.com/234/34_12_400_300.jpg | 234 | 34 | 12 | 400 | 300
www.examplo.com/984/12_90_250_360.jpg | 984 | 12 | 90 | 250 | 360
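If the extraction can happen before the data reaches the database, the same splitting is easy to do client-side. A minimal Python sketch, assuming URLs always have the form host/table_id/x_y_height_width.ext with no scheme prefix (the same assumption the SQL function makes); `split_url` here is a hypothetical Python counterpart of the function above:

```python
def split_url(url):
    """Return (table_id, x, y, height, width) parsed from
    'host/table_id/x_y_height_width.ext'."""
    parts = url.split("/")
    table_id = int(parts[1])
    # Drop the file extension, then split the remaining name on underscores.
    stem = parts[2].split(".")[0]
    x, y, height, width = (int(v) for v in stem.split("_"))
    return table_id, x, y, height, width


print(split_url("www.domain.com/234/34_12_400_300.jpg"))  # (234, 34, 12, 400, 300)
```

Malformed URLs raise an exception here (IndexError or ValueError), so real CSV input would likely need validation first.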

Sql Query to select missing records based on multiple hard coded ranges

Creating a SQL query that performs math with variables from multiple tables
This question, which I asked previously, will help a bit with the layout. To save time, I'll include the important bits here and add more detail for the scenarios relevant to this question:
A mockup of what the tables look like:
Inventory
ID | lowrange | highrange | ItemType
----------------------------------------
1 | 15 | 20 | 1
2 | 21 | 30 | 1
3 | null | null | 1
4 | 100 | 105 | 2
MissingOrVoid
ID | Item | Missing | Void
---------------------------------
1 | 17 | 1 | 0
1 | 19 | 1 | 0
4 | 102 | 0 | 1
4 | 103 | 1 | 0
4 | 104 | 1 | 0
TableWithDataEnteredForItemType1
InventoryID| ItemID | Detail1 | Detail2 | Detail3
-------------------------------------------------
1 | 16 | Some | Info | Here
1 | 18 | More | Info | Here
1 | 20 | Data | Is | Here
2 | 21 | .... | .... | ....
2 | 24 | .... | .... | ....
2 | 28 | .... | .... | ....
2 | 29 | .... | .... | ....
2 | 30 | .... | .... | ....
TableWithDataEnteredForItemType2
InventoryID| ItemID | Col1 | Col2 | Col3
----------------------------------------
4 | 101 | .... | .... | ....
I attempted this. I know it is not functional but it illustrates what I'm trying to do and I personally haven't seen anything written up like this before:
SELECT CASE WHEN (I.ItemType = 1) THEN SELECT TONE.ItemID FROM
TableWithDataEnteredForItemType1 TONE WHEN (I.ItemType = 2)
THEN SELECT TTWO.ItemID FROM TableWithDataEnteredForItemType2
TTWO END AS ItemMissing Inventory I JOIN CASE WHEN (I.ItemType = 1) THEN
TableWithDataEnteredForItem1 T WHEN (I.ItemType = 2) THEN
TableWithDataEnteredForItem2 T END ON
I.ID = T.InventoryID WHERE ItemMissing NOT BETWEEN IN (SELECT
I.lowrange FROM Inventory WHERE I.lowrange IS NOT NULL) AND IN
(SELECT I.highrange FROM Inventory WHERE I.highrange IS NOT NULL)
AND ItemMissing NOT IN (SELECT Item from MissingOrVoid)
The result should be:
ItemMissing
----
15
22
23
25
26
27
105
I know I'm probably not even going in the right direction with my query, but I was hoping I could get some direction as to fixing it to get the results that are needed.
Thanks
Edit:
Specific requirements (thought I included this but appears I didn't) - return list of all items not accounted for in the system. There are two ways of something being accounted for: 1) an entry in the corresponding ItemType table 2) located in MissingOrVoid (known items to be missing or removed).
DDL (I had to look this up as I wasn't sure what was meant here. Since I have very little experience creating tables by scripting, this will probably be pseudo-DDL):
Inventory
ID - int, identifier/primary key, not nullable
lowrange - int, nullable
highrange - int, nullable
itemtype - int, not nullable
MissingOrVoid
ID - int, foreign key for Inventory.ID, not nullable
Item - int, identifier/primary key, not nullable
missing - bit, not nullable
void - bit, not nullable
Tables for Item types:
InventoryID - int, foreign key for Inventory.ID, not nullable
ItemID - int, primary key, not nullable
Everything else - not needed for querying, just data about the item (included
to show that the tables aren't the same content)
Edit 2: Here's an incredibly inefficient C# and Linq way of doing it but maybe of some help:
List<int> Items = new List<int>();
List<int> MoV = (from c in db.MissingOrVoid Select c.Item).ToList();
foreach (Table...ItemType1 row in db.Table...ItemType1)
Items.Add(row.ItemID);
foreach (Table...ItemType2 row in db.Table...ItemType2)
Items.Add(row.ItemID);
List<Range> InventoryRanges = new List<Range>();
foreach (Inventory row in db.Inventories)
{
if (row.lowrange != null && row.highrange != null)
InventoryRanges.Add(new Range(row.lowrange, row.highrange));
}
foreach (int item in Items)
{
foreach (Range range in InventoryRanges)
{
if (range.lowrange <= item && range.highrange >= item)
Items.Remove(item);
}
if (MoV.Contains(item))
Items.Remove(item);
}
return Items;

There's a ready-made number table called master..spt_values, which can be quite helpful in this case. Note, though, that you can use this table only if the distance between lowrange and highrange cannot exceed 2047; otherwise, create, populate and use your own number table instead.
Here's the method:
SELECT
ItemMissing = inv.Item
FROM (
SELECT
i.ID,
Item = i.lowrange + v.number,
i.ItemType
FROM Inventory i
INNER JOIN master..spt_values v
ON v.type = 'P' AND v.number BETWEEN 0 AND i.highrange - i.lowrange
) inv
LEFT JOIN MissingOrVoid m
ON inv.ID = m.ID AND inv.Item = m.Item
LEFT JOIN TableWithDataEnteredForItemType1 t1 ON inv.ItemType = 1
AND inv.ID = t1.InventoryID AND inv.Item = t1.ItemID
LEFT JOIN TableWithDataEnteredForItemType2 t2 ON inv.ItemType = 2
AND inv.ID = t2.InventoryID AND inv.Item = t2.ItemID
WHERE m.ID IS NULL AND t1.InventoryID IS NULL AND t2.InventoryID IS NULL
The subselect expands the Inventory table into a complete item list with item IDs as defined by lowrange and highrange (this is where the number table comes in handy). The obtained list is then compared against the other three tables to find and exclude those items that are present in them. The remaining items, then, constitute the list of 'items missing'.
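The expand-then-exclude logic can be sketched in a few lines of Python. This is a simplified illustration, not a translation of the query: it merges the item-type tables and MissingOrVoid into one set of (inventory id, item) pairs and ignores the ItemType column; `missing_items` is a hypothetical name, and the data is the mock-up from the question.

```python
def missing_items(inventory, accounted):
    """inventory: list of (id, lowrange, highrange); accounted: set of (id, item).

    Expands each range into individual items and returns those not accounted for.
    """
    missing = []
    for inv_id, low, high in inventory:
        if low is None or high is None:
            continue  # a NULL range expands to no items, as in the SQL join
        for item in range(low, high + 1):
            if (inv_id, item) not in accounted:
                missing.append(item)
    return missing


inventory = [(1, 15, 20), (2, 21, 30), (3, None, None), (4, 100, 105)]
entered = {(1, 16), (1, 18), (1, 20), (2, 21), (2, 24), (2, 28), (2, 29),
           (2, 30), (4, 101)}
missing_or_void = {(1, 17), (1, 19), (4, 102), (4, 103), (4, 104)}
print(missing_items(inventory, entered | missing_or_void))
```

With the mock-up rows this returns 15, 22, 23, 25, 26, 27, 100 and 105; item 100 shows up because nothing in the sample tables accounts for it.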