How many intersects are in the table? - postgresql

longitude | latitude
----------+---------
1 | 2
2 | 3
4 | 5
2 | 3
5 | 6
1 | 2
How can I find how many intersecting (duplicate) points are in the table? In this case, (1,2) and (2,3).
SELECT ST_Intersects

E.g.:
select longitude, latitude, count(*) as intersects
from table_name
group by longitude, latitude
having count(*) > 1;
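As a quick check of the grouping query above, here is a minimal sketch that runs it against the sample rows, using SQLite through Python's sqlite3 module as a stand-in for PostgreSQL. The table name `points` and the alias `occurrences` are assumptions made for the example; the query itself uses only standard SQL.

```python
import sqlite3

# Sample rows from the question; the table name "points" is hypothetical.
rows = [(1, 2), (2, 3), (4, 5), (2, 3), (5, 6), (1, 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (longitude INTEGER, latitude INTEGER)")
conn.executemany("INSERT INTO points VALUES (?, ?)", rows)

# Group identical coordinate pairs and keep those that occur more than once.
duplicates = conn.execute(
    """
    SELECT longitude, latitude, COUNT(*) AS occurrences
    FROM points
    GROUP BY longitude, latitude
    HAVING COUNT(*) > 1
    ORDER BY longitude, latitude
    """
).fetchall()
conn.close()

print(duplicates)  # [(1, 2, 2), (2, 3, 2)]
```

The number of distinct repeated points is then `len(duplicates)`, which is 2 for this data.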

Related

PostgresQL for each row, generate new rows and merge

I have a table called example that looks as follows:
ID | MIN | MAX |
1 | 1 | 5 |
2 | 34 | 38 |
I need to take each ID and loop from its min to its max, incrementing by 2, to get the following in a single SELECT, without using INSERT statements:
ID | INDEX | VALUE
1 | 1 | 1
1 | 2 | 3
1 | 3 | 5
2 | 1 | 34
2 | 2 | 36
2 | 3 | 38
Any ideas of how to do this?
The set-returning function generate_series does exactly that:
SELECT
id,
generate_series(1, (max-min)/2+1) AS index,
generate_series(min, max, 2) AS value
FROM
example;
The index can alternatively be generated with RANK() (see also @a_horse_with_no_name's answer) if you don't want to rely on the parallel sets.
Use generate_series() to generate the numbers and a window function to calculate the index:
select e.id,
row_number() over (partition by e.id order by g.value) as index,
g.value
from example e
cross join generate_series(e.min, e.max, 2) as g(value);
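To make the expected expansion concrete, here is a small Python sketch of the same logic outside the database. It is only an illustration: the `example` list and the `expand` helper are hypothetical names, with `range` standing in for generate_series and `enumerate` standing in for row_number().

```python
# Hypothetical in-memory copy of the "example" table: (id, min, max).
example = [(1, 1, 5), (2, 34, 38)]


def expand(rows, step=2):
    """Yield (id, index, value) for each value from min to max in the given step."""
    for row_id, lo, hi in rows:
        # range(lo, hi + 1, step) mirrors generate_series(min, max, 2),
        # whose upper bound is inclusive.
        for index, value in enumerate(range(lo, hi + 1, step), start=1):
            yield row_id, index, value


result = list(expand(example))
for row in result:
    print(row)
```

This yields the six rows from the question, with the index restarting at 1 for each id.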

How can I use PostGIS to select the average price of the closest X locations?

I would like to find the average price of gas for any given home. Here are my current tables.
home_id | geocoordinates
1 | 0101000020E61000005BB6D617097544
2 | 0101000020E61000005BB6D617097545
3 | 0101000020E61000005BB6D617097546
4 | 0101000020E61000005BB6D617097547
5 | 0101000020E61000005BB6D617097548
gas_price | geocoordinates
1 | 0101000020E61000005BB6D617097544
1 | 0101000020E61000005BB6D617097545
1 | 0101000020E61000005BB6D617097546
2 | 0101000020E61000005BB6D617097547
2 | 0101000020E61000005BB6D617097548
2 | 0101000020E61000005BB6D617097544
2 | 0101000020E61000005BB6D617097545
3 | 0101000020E61000005BB6D617097546
3 | 0101000020E61000005BB6D617097547
3 | 0101000020E61000005BB6D617097548
3 | 0101000020E61000005BB6D617097544
4 | 0101000020E61000005BB6D617097545
4 | 0101000020E61000005BB6D617097546
4 | 0101000020E61000005BB6D617097547
For each home, I would like to find the average gas price of the X closest gas_prices. Example if X=5:
home_id | average_of_closest_five_gas_prices
1 | 1.5
2 | 2.5
3 | 2.1
4 | 1.5
5 | 1.5
I figured it out for using one individual home_id but I'm struggling to figure out how to do it for all.
select avg(gas_price) from (
SELECT *
FROM gas_price
ORDER BY gas_price.geocoordinates <-> '0101000020E61000005BB6D617097544'
LIMIT 5
) as table_a
You can use a lateral join to limit the size of each group before aggregating with GROUP BY.
select home_id, avg(gas_price)
from home,
lateral (
select gas_price
from gas_price
order by gas_price.geocoordinates <-> home.geocoordinates
limit 5
) x
group by home_id;
Another option is to use a window function: partition by home_id, order by distance, and keep only the rows with row_number() <= 5.
select home_id, avg(gas_price)
from (
select row_number() over w as r, *
from home h, gas_price g
window w as (partition by home_id order by g.geocoordinates <-> h.geocoordinates)
) x
where r <= 5
group by home_id;
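Both queries compute a per-home "k nearest, then average". A rough Python sketch of that logic follows. It assumes plain planar coordinates and Euclidean distance, and the `homes` and `gas_prices` values are made up for the example (the geometries in the question are not reproduced); `average_of_nearest` is a hypothetical helper name.

```python
import heapq
import math
from statistics import mean

# Hypothetical planar coordinates, invented for this illustration.
homes = {1: (0.0, 0.0), 2: (10.0, 0.0)}
gas_prices = [
    (1.0, (1.0, 0.0)),
    (2.0, (2.0, 0.0)),
    (3.0, (3.0, 0.0)),
    (4.0, (9.0, 0.0)),
    (5.0, (11.0, 0.0)),
]


def average_of_nearest(home_xy, stations, k):
    """Average the prices of the k stations closest to home_xy."""
    # Plays the role of ORDER BY distance LIMIT k inside the lateral subquery.
    nearest = heapq.nsmallest(k, stations, key=lambda s: math.dist(home_xy, s[1]))
    return mean(price for price, _ in nearest)


# One aggregate per home, like GROUP BY home_id.
averages = {home_id: average_of_nearest(xy, gas_prices, k=2) for home_id, xy in homes.items()}
print(averages)  # {1: 1.5, 2: 4.5}
```

Unlike this brute-force version, the lateral-join query can use a spatial index for the `<->` ordering, which is usually what makes it preferable on large tables.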

How would you create a group identifier based on one column, but sorted by another?

I am attempting to create column Group via T-SQL.
If a cluster of accounts appears in consecutive rows, consider that one group. If the account is seen again lower in the list (cluster or not), consider it a new group. This seems straightforward, but I cannot seem to see the solution... Below there are three clusters of account 3456, each with a different group number (groups 1, 4, and 6).
+-------+---------+------+
| Group | Account | Sort |
+-------+---------+------+
| 1 | 3456 | 1 |
| 1 | 3456 | 2 |
| 2 | 9878 | 3 |
| 3 | 5679 | 4 |
| 4 | 3456 | 5 |
| 4 | 3456 | 6 |
| 4 | 3456 | 7 |
| 5 | 1295 | 8 |
| 6 | 3456 | 9 |
+-------+---------+------+
UPDATE: I left this out of the original requirements, but a cluster of accounts could have more than two accounts. I updated the example data to include this scenario.
Here's how I'd do it:
--Sample Data
DECLARE @table TABLE (Account INT, Sort INT);
INSERT @table
VALUES (3456,1),(3456,2),(9878,3),(5679,4),(3456,5),(3456,6),(1295,7),(3456,8);
--Solution
SELECT [Group] = DENSE_RANK() OVER (ORDER BY grouper.groupID), grouper.Account, grouper.Sort
FROM
(
SELECT t.*, groupID = ROW_NUMBER() OVER (ORDER BY t.sort) +
CASE t.Account WHEN LEAD(t.Account,1) OVER (ORDER BY t.sort) THEN 1 ELSE 0 END
FROM @table AS t
) AS grouper;
Results:
Group Account Sort
------- ----------- -----------
1 3456 1
1 3456 2
2 9878 3
3 5679 4
4 3456 5
4 3456 6
5 1295 7
6 3456 8
Update based on OP's comment below (20190508)
I spent a couple days banging my head on how to handle groups of three or more; it was surprisingly difficult but what I came up with handles bigger clusters and is way better than my first answer. I updated the sample data to include bigger clusters.
Note that I include a UNIQUE constraint on the Sort column, which creates a unique index. You don't need the constraint for this solution to work, but having an index on that column (clustered, nonclustered unique, or just nonclustered) will improve performance dramatically.
--Sample Data
DECLARE @table TABLE (Account INT, Sort INT UNIQUE);
INSERT @table
VALUES (3456,1),(3456,2),(9878,3),(5679,4),(3456,5),(3456,6),(1295,7),(1295,8),(1295,9),(1295,10),(3456,11);
-- Better solution
WITH Groups AS
(
SELECT t.*, Grouper =
CASE t.Account WHEN LAG(t.Account,1,t.Account) OVER (ORDER BY t.Sort) THEN 0 ELSE 1 END
FROM @table AS t
)
SELECT [Group] = SUM(sg.Grouper) OVER (ORDER BY sg.Sort)+1, sg.Account, sg.Sort
FROM Groups AS sg;
Results:
Group Account Sort
----------- ----------- -----------
1 3456 1
1 3456 2
2 9878 3
3 5679 4
4 3456 5
4 3456 6
5 1295 7
5 1295 8
5 1295 9
5 1295 10
6 3456 11
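The second solution flags each row where the account changes and then takes a running total of the flags. As a rough illustration of why that numbers the clusters correctly, here is the same computation sketched in Python on the updated sample data; the variable names are hypothetical, and this is not a substitute for the T-SQL.

```python
from itertools import accumulate

# (account, sort) rows from the updated sample data.
rows = [(3456, 1), (3456, 2), (9878, 3), (5679, 4), (3456, 5), (3456, 6),
        (1295, 7), (1295, 8), (1295, 9), (1295, 10), (3456, 11)]

rows.sort(key=lambda r: r[1])  # ORDER BY Sort
accounts = [account for account, _ in rows]

# LAG(Account, 1, Account): the previous account, defaulting to the row's
# own value on the first row so that it is never flagged as a change.
previous = [accounts[0]] + accounts[:-1]

# Grouper: 1 where the account differs from the previous row, 0 otherwise.
flags = [0 if cur == prev else 1 for cur, prev in zip(accounts, previous)]

# SUM(Grouper) OVER (ORDER BY Sort) + 1: the running total numbers the clusters.
groups = [total + 1 for total in accumulate(flags)]

for group, (account, sort) in zip(groups, rows):
    print(group, account, sort)
```

Because the running total only increases on a change, a cluster of any length keeps one group number, and a repeated account further down the list starts a new one.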

PostgreSQL WITH RECURSIVE query to get ordered parent-child chain by a Partition Key

I'm having trouble writing a SQL script on PostgreSQL 9.6.6 that orders the steps in a process using the steps' parent-child IDs, grouped/partitioned per process ID. I couldn't find this special case here, so I apologize if I missed it; if so, please provide a link to the solution in the comments.
The case: I have a table which looks like this:
processID | stepID | parentID
1 1 NULL
1 3 5
1 2 4
1 4 3
1 5 1
2 1 NULL
2 3 5
2 2 4
2 4 3
2 5 1
Now I have to order the steps by starting with the step where parentID is NULL for each processID .
Note: I cannot simply order by stepID or parentID, because new steps that I insert into the middle of a process get a higher stepID than the last step in the process (stepID is a continuously generated surrogate key).
I have to order the steps for every processID so that I get the following output:
processID | stepID | parentID
1 1 NULL
1 5 1
1 3 5
1 4 3
1 2 4
2 1 NULL
2 5 1
2 3 5
2 4 3
2 2 4
I tried to do this with the CTE function WITH RECURSIVE:
WITH RECURSIVE
starting (processID,stepID, parentID) AS
(
SELECT b.processID,b.stepID, b.parentID
FROM process b
WHERE b.parentID ISNULL
),
descendants (processID,stepID, parentID) AS
(
SELECT b.processID,b.stepID, b.parentID
FROM starting b
UNION ALL
SELECT b.processID,b.stepID, b.parentID
FROM process b
JOIN descendants AS c ON b.parentID = c.stepID
)
SELECT * FROM descendants
The result is not what I am searching for. As we have hundreds of processes, I receive a list where the first records are the different processIDs which have a NULL value as parentID.
I guess I have to make the recursion run per processID as well, but I have no idea how.
Thank you for your help!
You should calculate the level of each step:
with recursive starting as (
select processid, stepid, parentid, 0 as level
from process
where parentid is null
union all
select p.processid, p.stepid, p.parentid, s.level + 1
from starting s
join process p on s.stepid = p.parentid and s.processid = p.processid
)
select *
from starting
order by processid, level
processid | stepid | parentid | level
-----------+--------+----------+-------
1 | 1 | | 0
1 | 5 | 1 | 1
1 | 3 | 5 | 2
1 | 4 | 3 | 3
1 | 2 | 4 | 4
2 | 1 | | 0
2 | 5 | 1 | 1
2 | 3 | 5 | 2
2 | 4 | 3 | 3
2 | 2 | 4 | 4
(10 rows)
Of course, you can skip the last column in the final select if you do not need it.
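For anyone who wants to try the approach without a PostgreSQL instance, here is a minimal sketch that runs essentially the same recursive CTE on the question's sample data, using SQLite through Python's sqlite3 module as a stand-in (an assumption of this example; SQLite also supports WITH RECURSIVE). The level column is used for ordering but left out of the final select.

```python
import sqlite3

# Sample data from the question: (processid, stepid, parentid).
steps = [
    (1, 1, None), (1, 3, 5), (1, 2, 4), (1, 4, 3), (1, 5, 1),
    (2, 1, None), (2, 3, 5), (2, 2, 4), (2, 4, 3), (2, 5, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE process (processid INTEGER, stepid INTEGER, parentid INTEGER)")
conn.executemany("INSERT INTO process VALUES (?, ?, ?)", steps)

ordered = conn.execute(
    """
    WITH RECURSIVE starting(processid, stepid, parentid, level) AS (
        -- Anchor: the first step of every process.
        SELECT processid, stepid, parentid, 0
        FROM process
        WHERE parentid IS NULL
        UNION ALL
        -- Recursive part: follow parent -> child within the same process.
        SELECT p.processid, p.stepid, p.parentid, s.level + 1
        FROM starting AS s
        JOIN process AS p
          ON s.stepid = p.parentid AND s.processid = p.processid
    )
    SELECT processid, stepid, parentid
    FROM starting
    ORDER BY processid, level
    """
).fetchall()
conn.close()

for row in ordered:
    print(row)
```

The `s.processid = p.processid` condition is what keeps the chains of different processes from mixing, since stepID values repeat across processes in this data.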

Select rows that satisfy a certain group condition in psql

Given the following table:
id | value
---+---------
1 | 1
1 | 0
1 | 3
2 | 1
2 | 3
2 | 5
3 | 2
3 | 1
3 | 0
3 | 1
I want the following table:
id | value
---+---------
1 | 1
1 | 0
1 | 3
3 | 2
3 | 1
3 | 0
3 | 1
The result should contain only the rows of ids whose minimum value is 0.
I have tried using EXISTS and HAVING, but with no success.
Try this:
select * from foo where id in (SELECT id FROM foo GROUP BY id HAVING MIN(value) = 0)
Or this (with window functions):
select * from
(select *,min(value) over (PARTITION BY id) min_by_id from foo) a
where min_by_id=0
If I'm understanding correctly, it's a fairly simple having clause:
=# SELECT id, MIN(value), MAX(value) FROM foo GROUP BY id HAVING MIN(value) = 0;
id | min | max
----+-----+-----
1 | 0 | 3
3 | 0 | 2
(2 rows)
Did I miss something that is making it more complicated?
It looks like it is not possible to use a window function directly in WHERE or HAVING. Below is a solution based on JOINs.
JOIN every row with all rows of the same id.
Filter based on second set.
Show result from first set.
The SQL looks like this.
SELECT a.*
FROM a_table AS a
INNER JOIN a_table AS b ON a.id = b.id
WHERE b.value = 0;
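As a sanity check of the HAVING-based subquery from the first answer, here is a minimal sketch that runs it on the question's data, using SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (an assumption of this example). The `ORDER BY rowid` is SQLite-specific and is there only to show the rows in insertion order.

```python
import sqlite3

# Sample data from the question: (id, value).
data = [(1, 1), (1, 0), (1, 3), (2, 1), (2, 3), (2, 5),
        (3, 2), (3, 1), (3, 0), (3, 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER, value INTEGER)")
conn.executemany("INSERT INTO foo VALUES (?, ?)", data)

# Keep every row whose id belongs to a group with a minimum value of 0.
kept = conn.execute(
    """
    SELECT id, value
    FROM foo
    WHERE id IN (SELECT id FROM foo GROUP BY id HAVING MIN(value) = 0)
    ORDER BY rowid
    """
).fetchall()
conn.close()

for row in kept:
    print(row)
```

Note that the self-join variant returns each row once per matching zero row, so an id with several zeros would produce duplicates, and it tests for the presence of a 0 rather than for a minimum of 0; the two agree on this data because no value is negative.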