postgresql postgis: defining a new subzone index consistent with the old one

Here is my problem: I had a polygon layer with an index which looks like this:
id, population
100, 26
200, 12
300, 45
...
I edited the polygon layer and divided some of the polygons into smaller polygons (approximately 3-7 subpolygons each). I already took care of splitting my data between the subzones (according to population density). So now I have this:
id, population
100, 22
100, 1
100, 3
200, 6
200, 6
I would like to create a new index that reflects the old one. For instance:
oldId, newId, population
100, 100, 22
100, 101, 1
100, 102, 3
200, 200, 6
200, 201, 6
Things I tried:
Defining a sequence:
DROP SEQUENCE IF EXISTS increment_id;
CREATE TEMP SEQUENCE increment_id INCREMENT BY 1 MINVALUE 0;
SELECT
id,
id+nextval('increment_id') AS new_id
FROM polygon_mapping WHERE id = 100;
This works well for a single id to renumber (the WHERE clause), but I don't know how to restart the sequence for every id.
I also thought about using the 'lag' function to compare the current id with the previous one, but I can't make it work.
Any suggestions?
Thank you
ps: I went through
Reset auto increment counter in postgres
where they reset the SEQUENCE, but I can't make that work inside a SELECT clause.

Maybe using generate_series()?
SELECT id, generate_series(id, id + count(*) - 1) AS newid
FROM polygon_mapping GROUP BY id;
If you want to select additional attributes, use a subquery and group the attributes using array_agg, then select the values from the array in the primary query:
SELECT id,
       generate - 1 + id AS newid,
       population_array[generate]
FROM (
    SELECT id,
           generate_series(1, count(*)) AS generate,
           array_agg(population) AS population_array
    FROM polygon_mapping GROUP BY id
) AS foo ORDER BY newid, id;
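For the record, a minimal window-function sketch that restarts the counter for every id without any sequence (the ORDER BY inside the window is an arbitrary choice here, so which subpolygon keeps the old id may differ from the example in the question):
-- Restart the counter per id using row_number(); assumes the polygon_mapping table from the question
SELECT id AS oldid,
       id + row_number() OVER (PARTITION BY id ORDER BY population DESC) - 1 AS newid,
       population
FROM polygon_mapping
ORDER BY oldid, newid;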

Related

update several rows of a table at once

In postgresql, I want to update several rows of a table according to their id. This doesn't work:
UPDATE table SET othertable_id = 5 WHERE id = 2, 45, 22, 75
What is the correct syntax for this?
Use an IN operator:
update the_table
set othertable_id = 5
where id in (2,45,22,75);
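If the list of ids arrives as a single array parameter (for instance from application code), an equivalent form uses = ANY:
-- Same update, but taking the ids as one array value
update the_table
set othertable_id = 5
where id = any (array[2,45,22,75]);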

ST_ClusterDBSCAN function and minpoints parameter definition

I've spent the last 2 days trying to figure out what's wrong with my clustering query. It seemed to work correctly, but after some more detailed testing I was confused to see that some clusters weren't created even though they evidently should have been.
So initially I was assuming that:
ST_ClusterDBSCAN(geom, eps := 1, minpoints := 4) OVER(PARTITION BY CONCAT(country_code, elevation_ft, height_ft, obstacle_type))
would result in clustering points that share the same "PARTITION BY" attributes, with the minimum group needing at least 4 points (including the core point) within eps distance. According to the docs:
https://postgis.net/docs/ST_ClusterDBSCAN.html
An input geometry will be added to a cluster if it is either:
A "core" geometry, that is within eps distance of at least minpoints input geometries (including itself) or
A "border" geometry, that is within eps distance of a core geometry.
But it seems that this is not exactly true. It seems that in order to cluster 4 points (as the minpoints parameter is set), the grouping query:
OVER(PARTITION BY CONCAT(country_code, elevation_ft, height_ft, obstacle_type))
needs to return at least five objects before a clst_id gets created for the other four. Here is an example:
CREATE TABLE IF NOT EXISTS public.point_table
(
point_sys_id serial primary key,--System generated Primary Key - Asset Cache
point_id bigint,
geom geometry(Point,4326),--Geometry Field
country_code varchar(4),--Country Code
elevation_ft numeric(7,2),--Elevation in Feet
height_ft numeric(7,2),--Height in Feet
obstacle_type varchar(50)--Obstacle Type
);
INSERT INTO point_table(point_id, geom, country_code, elevation_ft, height_ft, obstacle_type)
VALUES
(1, '0101000020E6100000E4141DC9E5934B40D235936FB6193940', 'ARE', 100, 50, 'BUILDING'),
(2, '0101000020E6100000C746205ED7934B40191C25AFCE193940', 'ARE', 100, 50, 'BUILDING'),
(3, '0101000020E6100000C780ECF5EE934B40B6BE4868CB193940', 'ARE', 100, 50, 'BUILDING'),
(4, '0101000020E6100000A97A358FA5AF4B4074A0C65B724C3940', 'ARE', 100, 50, 'BUILDING'), -- this point is outside of the cluster distance (eps)
(5, '0101000020E6100000ABB2EF8AE0934B404451A04FE4193940', 'ARE', 100, 50, 'BUILDING');
select ST_ClusterDBSCAN(geom, eps := 0.000906495804256269, minpoints := 4) OVER(PARTITION BY CONCAT(country_code, elevation_ft, height_ft, obstacle_type)) as clst_id,
point_id, geom, country_code, elevation_ft, height_ft, obstacle_type
from point_table
--where point_id != 4
Running the clustering query against all five points works fine. But once you exclude the seemingly irrelevant point_id = 4 (which is outside the eps distance anyway), the clustering stops working (clst_id becomes null), even though the 4 points required according to the docs are still in place.
Once I change the minpoints parameter to 3, clustering works fine for those 4 neighboring points.
Can someone confirm my conclusion that ST_ClusterDBSCAN is not behaving correctly, or give a good explanation for this behavior?
EDIT:
I've submitted a ticket to PostGIS directly: https://trac.osgeo.org/postgis/ticket/4853
and it seems this has been fixed as of version 3.1 :)
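If you want to check whether your installation already contains that fix, the version is easy to query:
-- Reports the PostGIS, GEOS, PROJ, ... versions of the current database
SELECT postgis_full_version();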

How to create multiple rows from a single row in Redshift SQL

I want to expand a single row to multiple rows in my table based on a column in the table in AWS Redshift.
Here is my example table schema and rows:
CREATE TABLE test (
start timestamp, -- start time of the first slot
slot_length int, -- the length of the slots in minutes
repeat int -- how many slots will be there
);
INSERT INTO test (start, slot_length, repeat) VALUES
('2019-09-22T00:00:00', 90, 2),
('2019-09-21T15:30:00', 60, 3);
I want to expand these two rows into 5 based on the value of the "repeat" column, so each row is expanded "repeat" times. The first expansion doesn't change anything; each subsequent expansion adds "slot_length" minutes to the "start" column. Here is the final list of rows I want to have in the end:
'2019-09-22 00:00:00', 90, 2 -- expanded from the first row
'2019-09-22 01:30:00', 90, 2 -- expanded from the first row
'2019-09-21 15:30:00', 60, 3 -- expanded from the second row
'2019-09-21 16:30:00', 60, 3 -- expanded from the second row
'2019-09-21 17:30:00', 60, 3 -- expanded from the second row
Can this be done via pure SQL in Redshift?
This SQL should solve your purpose. Kindly up-vote if it does.
select t.start
, case when rpt.repeat>1
then dateadd(min,t.slot_length*(rpt.repeat-1),t.start)
else t.start
end as new_start
, t.slot_length
, t.repeat
from schema.test t
join (select row_number() over() as repeat from schema.random_table) rpt
on t.repeat>=rpt.repeat
order by t.slot_length desc,rpt.repeat;
Please note that the "random_table" in your schema should have at least as many rows as the maximum value in your "repeat" column.
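If you would rather not depend on an existing wide table, a variant that builds the numbers inline is sketched below; the hard-coded maximum of 4 is an assumption, so extend the UNION ALL list (or generate the numbers another way) to cover the largest value of "repeat" you expect:
-- Inline numbers CTE instead of schema.random_table
with numbers (n) as (
    select 1 union all select 2 union all select 3 union all select 4
)
select dateadd(min, t.slot_length * (numbers.n - 1), t.start) as new_start,
       t.slot_length,
       t.repeat
from test t
join numbers on numbers.n <= t.repeat
order by t.start, new_start;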

Constraint on sum from rows

I've got a table in PostgreSQL 9.4:
user_votes (
    user_id int,
    portfolio_id int,
    car_id int,
    vote int
)
Is it possible to put a constraint on the table so that a user can spend at most 99 points voting within each portfolio?
This means that a user can have multiple rows with the same user_id and portfolio_id but different car_id and vote. The sum of the votes should never exceed 99, but it can be spread among different cars.
So doing:
INSERT INTO user_votes (user_id, portfolio_id, car_id, vote) VALUES
(1, 1, 1, 20),
(1, 1, 7, 40),
(1, 1, 9, 25);
would all be allowed, but trying to add something that pushes the total over 99 votes should fail, like this additional row:
INSERT INTO user_votes (user_id, portfolio_id, car_id, vote) VALUES
(1, 1, 21, 40);
Unfortunately no; if you try to create such a constraint you will see this error message:
ERROR: aggregate functions are not allowed in check constraints
But the wonderful thing about PostgreSQL is that there is always more than one way to skin a cat. You can use a BEFORE trigger to check that the data you are trying to insert fulfills your requirements.
Row-level triggers fired BEFORE can return null to signal the trigger
manager to skip the rest of the operation for this row (i.e.,
subsequent triggers are not fired, and the INSERT/UPDATE/DELETE does
not occur for this row). If a nonnull value is returned then the
operation proceeds with that row value.
Inside your trigger you would sum the votes the user has already cast in that portfolio:
SELECT COALESCE(SUM(vote), 0) INTO vote_total FROM user_votes WHERE user_id = NEW.user_id AND portfolio_id = NEW.portfolio_id;
Now if vote_total + NEW.vote exceeds 99 you return NULL and the row will not be inserted.
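A minimal sketch of such a trigger (the function and trigger names are illustrative), enforcing the per-user, per-portfolio budget described in the question:
-- Sketch: reject an insert if it would push the user's votes in this portfolio over 99
CREATE OR REPLACE FUNCTION check_vote_budget() RETURNS trigger AS $$
DECLARE
    vote_total int;
BEGIN
    SELECT COALESCE(SUM(vote), 0) INTO vote_total
    FROM user_votes
    WHERE user_id = NEW.user_id
      AND portfolio_id = NEW.portfolio_id;

    IF vote_total + NEW.vote > 99 THEN
        RETURN NULL;  -- skip this row; alternatively RAISE EXCEPTION to abort the whole statement
    END IF;

    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER user_votes_budget
BEFORE INSERT ON user_votes
FOR EACH ROW EXECUTE PROCEDURE check_vote_budget();
Note that concurrent inserts from different transactions can still race past the limit unless you add explicit locking (for example, locking the user's existing vote rows first).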

Precedence/weight to a column using FREETEXTTABLE in dynamic TSQL

I have dynamic SQL that performs paging and a full text search using CONTAINSTABLE, which works fine. The problem is I would like to use FREETEXTTABLE but weight the rank of some columns over others.
Here is my original SQL and the ranking weight I would like to integrate
(I have changed names for privacy reasons)
SELECT * FROM
(SELECT TOP 10 Things.ID, ROW_NUMBER()
OVER(ORDER BY KEY_TBL.RANK DESC ) AS Row FROM [Things]
INNER JOIN
CONTAINSTABLE([Things],(Features,Description,Address),
'ISABOUT("cow" weight (.9), "cow" weight(.1))') AS KEY_TBL
ON [Things].ID = KEY_TBL.[KEY]
WHERE TypeID IN (91, 48, 49, 50, 51, 52, 53)
AND
dbo.FN_CalcDistanceBetweenLocations(51.89249, -8.493376,
Latitude, Longitude) <= 2.5
ORDER BY KEY_TBL.RANK DESC ) x
WHERE x.Row BETWEEN 1 AND 10
Here is what I would like to integrate
select sum(rnk) as weightRank
from
(select
Rank * 2.0 as rnk,
[key]
from freetexttable(Things,Address,'cow')
union all
select
Rank * 1.0 as rnk,
[key]
from freetexttable(Things,(Description,Features),'cow')) as t
group by [key]
order by weightRank desc
Unfortunately, the algorithm used by the freetext engine (FREETEXTTABLE) has no way to specify the significance of the various input columns. If this is critical, you may need to consider using a different product for your freetext needs.
You can create a column with the concatenation of:
Less_important_field &
More_important_field & More_important_field (2x)
This might look really stupid, but it's actually what BM25F does to simulate structured documents. The only downside of this hack-implementation is that you can't actually dynamically change the weight. It bloats up the table a bit, but not necessarily the index, which should only need counts.
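A rough T-SQL sketch of that idea (the WeightedText column name is illustrative; it assumes the Description/Features/Address columns and the existing full-text index on [Things] from the question, with Address being the column you want to weight roughly 2x):
-- Computed, persisted column that repeats the more important Address column
ALTER TABLE [Things] ADD WeightedText AS
    (ISNULL([Description], '') + ' ' + ISNULL(Features, '') + ' '
     + ISNULL(Address, '') + ' ' + ISNULL(Address, '')) PERSISTED;

-- Add the new column to the existing full-text index, then query it on its own:
ALTER FULLTEXT INDEX ON [Things] ADD (WeightedText);
-- FREETEXTTABLE([Things], WeightedText, 'cow') will now rank Address matches more heavily.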