How to check 3D coordinates against 3D bounding box in PostGIS? - postgresql

I would have imagined the obvious query was:
postgres=# SELECT ST_GeomFromText( 'POINT( 1 2 3 )' ) &&&
'BOX3D( -5 -5 -5, 5 5 5 )'::box3d;
But this results in
?column?
----------
f
As opposed to t.
The query seems to lose the z-coordinate from the bounding box completely. This also results in the following issue where a bounding box ranging from z=1 to z=2 will return t for a point at z=0:
galaxymap=# SELECT ST_GeomFromText( 'POINT( 0 0 0 )' ) &&&
'BOX3D( -1 -1 1, 1 1 2 )'::box3d;
?column?
----------
t
(1 row)

After an hour of googling I finally happened upon an e-mail conversation on the postgis-devel mailing list.
Our boxes are all broken.
There should be somewhere a wiki page or ticker or something about
options to improve the situation.
The suggested workaround seems to be using lines (or bounding diagonals, which I didn't try):
SELECT ST_MakePoint( 1, 2, 3 ) &&& ST_MakeLine(
ST_MakePoint( -10, -10, -10 ), ST_MakePoint( 10, 10, 10 ) );
?column?
----------
t
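
For reference, the answer the first two queries were expected to give is a plain per-axis range check. The following is a minimal Python sketch of that check, independent of PostGIS; the function name `point_in_box3d` and the tuple layout are assumptions made for illustration and are not part of any PostGIS API.

```python
def point_in_box3d(point, box_min, box_max):
    """Return True if a 3D point lies inside an axis-aligned 3D box.

    point, box_min and box_max are (x, y, z) tuples; the box is taken
    to include its faces (hypothetical helper, not a PostGIS function).
    """
    return all(lo <= p <= hi for p, lo, hi in zip(point, box_min, box_max))


# The two cases from the question:
print(point_in_box3d((1, 2, 3), (-5, -5, -5), (5, 5, 5)))  # inside on all axes
print(point_in_box3d((0, 0, 0), (-1, -1, 1), (1, 1, 2)))   # z=0 is below z=1..2
```

The second call is the case where the `&&&` query above unexpectedly reported a match, because the z range of the box was not taken into account.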

Related

PostGIS make buffer on LINESTRING Z to have a POLYGON Z

I have several LINESTRING Z geometries in PostgreSQL, and they look like
LINESTRING Z (1 2 1,1 1 4)
I want to make a buffer around this linestring so that I can have a POLYGON Z geometry for further export to DXF.
I tried this
select st_astext(st_buffer('LINESTRING Z (1 2 1,1 1 4)'::geometry, 2)) as geom;
and it gives me
POLYGON((3 1,2.96157056080646 0.609819355967741,2.84775906502257 0.234633135269818,2.66293922460509 -0.111140466039206,2.41421356237309 -0.414213562373096,2.1111404660392 -0.662939224605091,1.76536686473018 -0.847759065022574,1.39018064403226 -0.961570560806461,1 -1,0.609819355967745 -0.961570560806461,0.234633135269822 -0.847759065022574,-0.111140466039202 -0.662939224605092,-0.414213562373094 -0.414213562373096,-0.662939224605089 -0.111140466039207,-0.847759065022572 0.234633135269818,-0.96157056080646 0.609819355967739,-1 1,-1 2,-0.96157056080646 2.39018064403226,-0.847759065022572 2.76536686473018,-0.662939224605089 3.11114046603921,-0.414213562373094 3.4142135623731,-0.111140466039203 3.66293922460509,0.234633135269821 3.84775906502257,0.609819355967744 3.96157056080646,1 4,1.39018064403226 3.96157056080646,1.76536686473018 3.84775906502257,2.1111404660392 3.66293922460509,2.41421356237309 3.4142135623731,2.66293922460509 3.11114046603921,2.84775906502257 2.76536686473018,2.96157056080646 2.39018064403226,3 2,3 1))
(1 row)
which is a 2D POLYGON, not a POLYGON Z.
How can I make it 3D?
I'm not totally sure what you want to achieve, but did you take a look at ST_Force3D?
SELECT
ST_AsText(
ST_Force3D(
ST_Buffer('LINESTRING Z (1 2 1,1 1 4)'::GEOMETRY, 2)));
It will return a POLYGON Z geometry:
POLYGON Z ((3 1 0,2.96157056080646 0.609819355967741 0,2.84775906502257 0.234633135269818 0,2.66293922460509 -0.111140466039206 0,2.41421356237309 -0.414213562373096 0,2.1111404660392 -0.662939224605091 0,1.76536686473018 -0.847759065022574 0,1.39018064403226 -0.961570560806461 0,1 -1 0,0.609819355967745 -0.961570560806461 0,0.234633135269822 -0.847759065022574 0,-0.111140466039202 -0.662939224605092 0,-0.414213562373094 -0.414213562373096 0,-0.662939224605089 -0.111140466039207 0,-0.847759065022572 0.234633135269818 0,-0.96157056080646 0.609819355967739 0,-1 1 0,-1 2 0,-0.96157056080646 2.39018064403226 0,-0.847759065022572 2.76536686473018 0,-0.662939224605089 3.11114046603921 0,-0.414213562373094 3.4142135623731 0,-0.111140466039203 3.66293922460509 0,0.234633135269821 3.84775906502257 0,0.609819355967744 3.96157056080646 0,1 4 0,1.39018064403226 3.96157056080646 0,1.76536686473018 3.84775906502257 0,2.1111404660392 3.66293922460509 0,2.41421356237309 3.4142135623731 0,2.66293922460509 3.11114046603921 0,2.84775906502257 2.76536686473018 0,2.96157056080646 2.39018064403226 0,3 2 0,3 1 0))
The function ST_Buffer discards the Z dimension, as stated in the documentation:
... This function ignores the third dimension (z) and will always give a
2-d buffer even when presented with a 3d-geometry.
EDIT:
This query approximates a 3D buffer by assigning the average Z value of the given LINESTRING Z to every vertex of the 2D buffer:
WITH j AS (
SELECT
ST_DumpPoints(
ST_Buffer('LINESTRING Z (1 2 1,1 1 4)'::GEOMETRY, 2)
) AS pt,
(SELECT AVG(z) AS avg_z
FROM (SELECT ST_Z((ST_DumpPoints('LINESTRING Z (1 2 1,1 1 4)'::GEOMETRY)).geom) AS z) AS z) AS lsz
)
SELECT ST_AsText(
ST_MakePolygon(ST_MakeLine(ST_MakePoint(ST_X((pt).geom),ST_Y((pt).geom),lsz))))
FROM j
GROUP BY lsz;
----------
POLYGON Z ((3 1 2.5,2.96157056080646 0.609819355967741 2.5,2.84775906502257 0.234633135269818 2.5,2.66293922460509 -0.111140466039206 2.5,2.41421356237309 -0.414213562373096 2.5,2.1111404660392 -0.662939224605091 2.5,1.76536686473018 -0.847759065022574 2.5,1.39018064403226 -0.961570560806461 2.5,1 -1 2.5,0.609819355967745 -0.961570560806461 2.5,0.234633135269822 -0.847759065022574 2.5,-0.111140466039202 -0.662939224605092 2.5,-0.414213562373094 -0.414213562373096 2.5,-0.662939224605089 -0.111140466039207 2.5,-0.847759065022572 0.234633135269818 2.5,-0.96157056080646 0.609819355967739 2.5,-1 1 2.5,-1 2 2.5,-0.96157056080646 2.39018064403226 2.5,-0.847759065022572 2.76536686473018 2.5,-0.662939224605089 3.11114046603921 2.5,-0.414213562373094 3.4142135623731 2.5,-0.111140466039203 3.66293922460509 2.5,0.234633135269821 3.84775906502257 2.5,0.609819355967744 3.96157056080646 2.5,1 4 2.5,1.39018064403226 3.96157056080646 2.5,1.76536686473018 3.84775906502257 2.5,2.1111404660392 3.66293922460509 2.5,2.41421356237309 3.4142135623731 2.5,2.66293922460509 3.11114046603921 2.5,2.84775906502257 2.76536686473018 2.5,2.96157056080646 2.39018064403226 2.5,3 2 2.5,3 1 2.5))
(1 row)
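
The idea behind the EDIT query can also be sketched outside the database: average the Z values of the line's vertices, then attach that constant Z to every vertex of the 2D buffer ring. This is only an illustration in plain Python; `lift_ring_to_mean_z` is a hypothetical helper, and the short ring below is a made-up stand-in for the real buffer outline.

```python
def lift_ring_to_mean_z(ring_2d, line_3d):
    """Attach the mean Z of line_3d to every (x, y) vertex of ring_2d.

    ring_2d: list of (x, y) tuples, e.g. the vertices of a 2D buffer.
    line_3d: list of (x, y, z) tuples, the original LINESTRING Z.
    """
    mean_z = sum(z for _, _, z in line_3d) / len(line_3d)
    return [(x, y, mean_z) for x, y in ring_2d]


line = [(1, 2, 1), (1, 1, 4)]                       # LINESTRING Z (1 2 1,1 1 4)
ring = [(3, 1), (3, 2), (-1, 2), (-1, 1), (3, 1)]   # simplified stand-in ring
print(lift_ring_to_mean_z(ring, line))
```

For the line in the question the mean Z is (1 + 4) / 2 = 2.5, which is the Z value that appears on every vertex of the query result above.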

Querying polygons that contain 4 points

I always receive 4 points, and I would like to check whether the polygon defined by a multipoint contains all of them. I'm using PostGIS and Postgres.
I'm also using OGR/GDAL for that purpose. Could someone provide the SQL query for this?
This checks if the points (1 1), (2 2), (3 3), and (4 4) all lie inside the polygon defined by (0 0), (10 0), (10 10), (0 10) and (0 0):
SELECT st_contains(
st_polygon(
st_linefrommultipoint(
st_mpointfromtext(
'MULTIPOINT(0 0, 10 0, 10 10, 0 10, 0 0)'
)
),
0
),
st_mpointfromtext(
'MULTIPOINT(1 1, 2 2, 3 3, 4 4)'
)
);
So, to find all multipoints that satisfy the criterion, you could use something like this:
SELECT id
FROM multipoints
WHERE st_contains(
st_polygon(
st_addpoint(
st_linefrommultipoint(
multipoints.geom
),
st_startpoint(
st_linefrommultipoint(
multipoints.geom
)
),
-1
),
st_srid(multipoints.geom)
),
st_mpointfromtext(
'MULTIPOINT(1 1, 2 2, 3 3, 4 4)',
8307
)
);
This assumes that the multipoints do not already form a closed ring (i.e., the first point is not repeated as the last point), which is why the start point is appended.
I used SRID 8307 in my example; replace it with the one you need.
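
The two steps the query performs, closing the ring and then testing containment, can be sketched in plain Python. This is an illustration only: it uses the common ray-casting test rather than whatever PostGIS does internally, the helper names are hypothetical, and points lying exactly on the boundary are not treated the way ST_Contains treats them.

```python
def close_ring(points):
    """Append the first point if the ring is not closed yet."""
    pts = list(points)
    if pts and pts[0] != pts[-1]:
        pts.append(pts[0])
    return pts


def point_in_ring(pt, ring):
    """Ray-casting test: count edge crossings of a ray going to the right."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def ring_contains_all(ring_points, query_points):
    ring = close_ring(ring_points)
    return all(point_in_ring(p, ring) for p in query_points)


square = [(0, 0), (10, 0), (10, 10), (0, 10)]  # not closed, like the table rows
print(ring_contains_all(square, [(1, 1), (2, 2), (3, 3), (4, 4)]))
```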

RankingMetrics in Spark (Scala)

I am trying to use Spark's RankingMetrics.meanAveragePrecision.
However, it seems that it's not working as expected.
val t2 = (Array(0,0,0,0,1), Array(1,1,1,1,1))
val r = sc.parallelize(Seq(t2))
val rm = new RankingMetrics[Int](r)
rm.meanAveragePrecision // Double = 0.2
rm.precisionAt(5) // Double = 0.2
t2 is a tuple in which the left array holds the actual values and the right array the predicted values (1 = relevant document, 0 = non-relevant).
If we calculate the average precision for t2 by hand, we get:
(0/1 + 0/2 + 0/3 + 0/4 + 1/5) / 5 = 1/25
But RankingMetrics returns 0.2 for meanAveragePrecision, where I expected 1/25.
Thanks.
I think that the problem is your input data. Since your predicted/actual data contains relevance scores, I think you should be looking at binary classification metrics rather than ranking metrics if you want to evaluate using the 0/1 scores.
RankingMetrics is expecting two lists/arrays of ranked items instead, so if you replace the scores with the document ids it should work as expected. Here is an example in PySpark, with two lists that only match on the 5th item:
from pyspark.mllib.evaluation import RankingMetrics
rdd = sc.parallelize([(['a','b','c','d','z'], ['e','f','g','h','z'])])
metrics = RankingMetrics(rdd)
for i in range(1, 6):
    print(i, metrics.precisionAt(i))
print('meanAveragePrecision', metrics.meanAveragePrecision)
print('Mean precisionAt', sum([0, 0, 0, 0, 0.2]) / 5)
Which produced:
1 0.0
2 0.0
3 0.0
4 0.0
5 0.2
meanAveragePrecision 0.04
Mean precisionAt 0.04
Basically, RankingMetrics works with two lists on each row:
the first list holds the items being recommended, and order matters here;
the second list holds the relevant items.
For example, in PySpark (but it should be equivalent for Scala or Java):
recs_rdd = sc.parallelize([
    (
        ['item1', 'item2', 'item3'],  # Recommendations in order
        ['item3', 'item2']            # Relevant items - Unordered
    ),
    (
        ['item3', 'item1', 'item2'],  # Recommendations in order
        ['item3', 'item2']            # Relevant items - Unordered
    ),
])
from pyspark.mllib.evaluation import RankingMetrics
rankingMetrics = RankingMetrics(recs_rdd)
print("MAP: ", rankingMetrics.meanAveragePrecision)
This prints the MAP value of 0.7083333333333333 and is calculated by
(
(1/2 + 2/3) / 2
+ (1/1 + 2/3) / 2
) / 2
Which equals 0.708333
where
row 1 is (1/2 + 2/3) / 2:
1/2 : 1 of the items in positions 1 to 2 is relevant
2/3 : 2 of the items in positions 1 to 3 are relevant
/ 2 : row 1 has 2 relevant items
row 2 is (1/1 + 2/3) / 2:
1/1 : the item in position 1 is relevant
2/3 : 2 of the items in positions 1 to 3 are relevant
/ 2 : row 2 has 2 relevant items
and the final / 2 is there because there are 2 rows.
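
The arithmetic above can be checked without Spark. The following is a minimal pure-Python sketch of the usual average-precision definition (precision summed at each hit, divided by the number of distinct relevant items); it is not Spark's source code, and the function names are hypothetical. It reproduces the numbers quoted in this thread, including the 0.2 from the question, where the 0/1 scores end up being treated as item ids.

```python
def average_precision(predicted, relevant):
    """Average precision of one ranked list against a set of relevant items."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(predicted, start=1):
        if item in relevant_set:
            hits += 1
            precision_sum += hits / rank  # precision at this hit
    return precision_sum / len(relevant_set)


def mean_average_precision(rows):
    """Mean of average_precision over (predicted, relevant) rows."""
    return sum(average_precision(p, r) for p, r in rows) / len(rows)


rows = [
    (['item1', 'item2', 'item3'], ['item3', 'item2']),
    (['item3', 'item1', 'item2'], ['item3', 'item2']),
]
print(mean_average_precision(rows))

# The question's input: the only distinct "relevant item" is the id 1,
# and it shows up at rank 5, so the result is (1/5) / 1.
print(average_precision([0, 0, 0, 0, 1], [1, 1, 1, 1, 1]))
```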

PostGIS: intersections of set of collinear line segments, with counts

I have a set of collinear line segments (may be mutually disjoint, contained, or overlapping).
I want to make a new set of line segments where the segments are disjoint or touching (not overlapping), and each line segment has a count of the original line segments that cover it.
For example, suppose the original set is (drawn non-collinearly for illustration):
A----------------------B
C---------------------------D
E-----F
G-------------H
I-------J
the desired new set would be:
A-------C---E-----F-----B-----------D G-------------H-------J
1 2 3 2 1 1 1
(only the point coordinates matter, the new set does not share point objects with the old set)
How can I achieve this with PostGIS?
Related question: suppose I start with a table of line segments, not all collinear, how do I write the entire query that groups the collinear segments together and then applies the solution to my first question?
Thanks for any help!
Setup (for later queries):
create table lines (
id serial primary key,
label text not null,
line_data geometry(linestring) not null
);
insert into lines(label, line_data)
values ('A-B', ST_MakeLine(ST_MakePoint(-3, -6), ST_MakePoint( 1, 2))),
('D-C', ST_MakeLine(ST_MakePoint( 2, 4), ST_MakePoint(-2, -4))),
('E-F', ST_MakeLine(ST_MakePoint(-1, -2), ST_MakePoint( 0, 0))),
('G-H', ST_MakeLine(ST_MakePoint( 3, 6), ST_MakePoint( 4, 8))),
('I-J', ST_MakeLine(ST_MakePoint( 4, 8), ST_MakePoint( 5, 10))),
('P-L', ST_MakeLine(ST_MakePoint( 1, 0), ST_MakePoint( 2, 2))),
('X-Y', ST_MakeLine(ST_MakePoint( 2, 2), ST_MakePoint( 0, 4)));
Notes:
I purposely swapped your C and D points to demonstrate the need for vector negation.
The P-L line is parallel to your example lines (but not collinear with them).
The X-Y line has nothing to do with the others.
The solutions below won't work when a linestring has more than 2 points that are not all on the same line (that is, when a single linestring is not straight).
The ST_Union aggregate function can split your collinear linestrings. You then just need to count how many of the original lines contain each piece.
However, grouping by collinearity is not that simple. I did not find an out-of-the-box solution for this, but you can calculate it (this does not calculate the counts yet):
select string_agg(label, ','), ST_AsText(ST_Multi(ST_Union(line_data)))
from lines
group by (
select case
when ST_SRID(s) <> ST_SRID(e) then row(ST_SRID(s), s, null)
when ST_X(s) = ST_X(e) then row(ST_SRID(s), ST_SetSRID(ST_MakePoint(ST_X(s), 1.0), ST_SRID(s)), null)
when ST_Y(s) = ST_Y(e) then row(ST_SRID(s), ST_SetSRID(ST_MakePoint(1.0, ST_Y(e)), ST_SRID(s)), null)
else (
select row(
ST_SRID(s),
(select case
when ST_Y(rv) < 0
then ST_SetSRID(ST_MakePoint(-ST_X(rv), -ST_Y(rv)), ST_SRID(s))
else rv
end), -- normalized vector (negated when necessary, but same for all parallel lines)
(ST_X(e) * ST_Y(s) - ST_X(s) * ST_Y(e)) / (ST_X(e) - ST_X(s)) -- solution of the linear equation, where x=0
)
from coalesce(1.0 / nullif(ST_Distance(s, e), 0), 0) dmi, -- distance's multiplicative inverse
ST_TransScale(e, -ST_X(s), -ST_Y(s), dmi, dmi) rv -- raw vector (translated and scaled)
)
end
from ST_StartPoint(line_data) s,
ST_EndPoint(line_data) e
)
will produce:
X-Y | MULTILINESTRING((2 2,0 4))
P-L | MULTILINESTRING((1 0,2 2))
E-F,A-B,I-J,G-H,D-C | MULTILINESTRING((-3 -6,-2 -4),(-2 -4,-1 -2),(-1 -2,0 0),(0 0,1 2),(2 4,1 2),(3 6,4 8),(4 8,5 10))
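
The grouping key can be illustrated in plain Python as well. Note that this sketch swaps in a different technique from the SQL above: instead of a normalized direction vector plus an intercept, it reduces each segment to integer line coefficients (a, b, c) of a*x + b*y = c, divided by their gcd and sign-normalized. It assumes integer coordinates, and `line_key` is a hypothetical helper name.

```python
from collections import defaultdict
from math import gcd


def line_key(p, q):
    """Canonical (a, b, c) with a*x + b*y = c for the line through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    c = a * x1 + b * y1
    g = gcd(gcd(abs(a), abs(b)), abs(c))
    if g == 0:
        raise ValueError("degenerate segment: both endpoints are equal")
    a, b, c = a // g, b // g, c // g
    # Make the sign unique so that A-B and B-A give the same key.
    if a < 0 or (a == 0 and b < 0):
        a, b, c = -a, -b, -c
    return (a, b, c)


segments = {
    'A-B': ((-3, -6), (1, 2)),
    'D-C': ((2, 4), (-2, -4)),
    'E-F': ((-1, -2), (0, 0)),
    'G-H': ((3, 6), (4, 8)),
    'I-J': ((4, 8), (5, 10)),
    'P-L': ((1, 0), (2, 2)),
    'X-Y': ((2, 2), (0, 4)),
}
groups = defaultdict(list)
for label, (p, q) in segments.items():
    groups[line_key(p, q)].append(label)
print(dict(groups))
```

With the setup data this yields three groups, matching the three rows of the SQL result: the five collinear segments, P-L on its own, and X-Y on its own.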
To calculate the counts, join your original data again, matching each split line to the original lines that contain it (ST_Contains):
select ST_AsText(splitted_line), count(line_data)
from (select ST_Multi(ST_Union(line_data)) ml
from lines
group by (
select case
when ST_SRID(s) <> ST_SRID(e) then row(ST_SRID(s), s, null)
when ST_X(s) = ST_X(e) then row(ST_SRID(s), ST_SetSRID(ST_MakePoint(ST_X(s), 1.0), ST_SRID(s)), null)
when ST_Y(s) = ST_Y(e) then row(ST_SRID(s), ST_SetSRID(ST_MakePoint(1.0, ST_Y(e)), ST_SRID(s)), null)
else (
select row(
ST_SRID(s),
(select case
when ST_Y(rv) < 0
then ST_SetSRID(ST_MakePoint(-ST_X(rv), -ST_Y(rv)), ST_SRID(s))
else rv
end), -- normalized vector (negated when necessary, but same for all parallel lines)
(ST_X(e) * ST_Y(s) - ST_X(s) * ST_Y(e)) / (ST_X(e) - ST_X(s)) -- solution of the linear equation, where x=0
)
from coalesce(1.0 / nullif(ST_Distance(s, e), 0), 0) dmi, -- distance's multiplicative inverse
ST_TransScale(e, -ST_X(s), -ST_Y(s), dmi, dmi) rv -- raw vector (translated and scaled)
)
end
from ST_StartPoint(line_data) s,
ST_EndPoint(line_data) e)) al,
generate_series(1, ST_NumGeometries(ml)) i,
ST_GeometryN(ml, i) splitted_line
left join lines on ST_Contains(line_data, splitted_line)
group by splitted_line
will return:
LINESTRING(-3 -6,-2 -4) | 1
LINESTRING(-2 -4,-1 -2) | 2
LINESTRING(-1 -2,0 0) | 3
LINESTRING(0 0,1 2) | 2
LINESTRING(2 2,0 4) | 1
LINESTRING(1 0,2 2) | 1
LINESTRING(2 4,1 2) | 1
LINESTRING(3 6,4 8) | 1
LINESTRING(4 8,5 10) | 1
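
Once the segments are known to be collinear, the split-and-count step reduces to a one-dimensional problem. The sketch below is plain Python, not PostGIS: it assumes each segment has already been reduced to a pair of positions along the shared line (here simply the x coordinates from the setup data), and `coverage_counts` is a hypothetical helper name.

```python
def coverage_counts(intervals):
    """Split overlapping 1D intervals at every endpoint and count coverage.

    intervals: list of (start, end) positions along the shared line; the
    endpoints may be given in either order. Returns a list of
    ((lo, hi), count) for every piece covered by at least one interval.
    """
    norm = [(min(a, b), max(a, b)) for a, b in intervals]
    cuts = sorted({p for iv in norm for p in iv})
    pieces = []
    for lo, hi in zip(cuts, cuts[1:]):
        count = sum(1 for a, b in norm if a <= lo and hi <= b)
        if count:  # skip gaps such as the one between D and G
            pieces.append(((lo, hi), count))
    return pieces


# x coordinates of A-B, D-C, E-F, G-H and I-J from the setup above
collinear = [(-3, 1), (2, -2), (-1, 0), (3, 4), (4, 5)]
for piece, count in coverage_counts(collinear):
    print(piece, count)
```

The counts come out as 1, 2, 3, 2, 1, 1, 1, which is the sequence asked for in the question.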

Facebook interview: find out the order that gives max sum by selecting boxes with number in a ring, when the two next to it is destroyed

Didn't find any similar question about this.
This is a final round Facebook question:
You are given a ring of boxes. Each box has a non-negative number on it, can be duplicate.
Write a function/algorithm that will tell you the order at which you select the boxes, that will give you the max sum.
The catch is, if you select a box, it is taken off the ring, and so are the two boxes next to it (to the right and the left of the one you selected).
so if I have a ring of
{10 3 8 12}
If I select 12, 8 and 10 will be destroyed and you are left with 3.
The max will be selecting 8 first then 10, or 10 first then 8.
I tried reassigning each box a value equal to its own value minus the values of the two boxes next to it, which act as the cost.
So the old ring is {10 3 8 12},
the new ring is {-5, -15, -7, -6}, and I pick the highest.
However, this definitely doesn't work for {10, 19, 10, 0}: you should take the two 10s, but the algorithm takes the 19 and the 0.
Help please?
It is most likely dynamic programming, but I don't know how.
The ring can be any size.
Here's some python that solves the problem:
def sublist(i,l):
    # Remove box i and its two ring neighbours; what is left is again a ring.
    if i == 0:
        return l[2:-1]
    elif i == len(l)-1:
        return l[1:-2]
    else:
        return l[0:i-1] + l[i+2:]

def val(l):
    # Best total obtainable from ring l, trying every possible first pick.
    if len(l) <= 3:
        return max(l)
    else:
        return max([v+val(m) for v,m in [(l[u],sublist(u,l)) for u in range(len(l))]])

def print_indices(l):
    print("Total:",val(l))
    while l:
        vals = [v+val(m) for v,m in [(l[u],sublist(u,l)) for u in range(len(l)) if sublist(u,l)]]
        if vals:
            i = vals.index(max(vals))
        else:
            i = l.index(max(l))
        print('choice:',l[i],'index:',i,'new list:',sublist(i,l))
        l = sublist(i,l)

print_indices([10,3,8,12])
print_indices([10,19,10,0])
Output:
Total: 18
choice: 10 index: 0 new list: [8]
choice: 8 index: 0 new list: []
Total: 20
choice: 10 index: 0 new list: [10]
choice: 10 index: 0 new list: []
No doubt it could be optimized a bit. The key bit is val(), which calculates the total value of a given ring. The rest is just bookkeeping.
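
One possible optimization, sketched here as an assumption rather than as the intended interview answer, is to cache the value of each sub-ring so that it is computed only once. The function name `max_ring_sum` is hypothetical. Caching avoids recomputing repeated sub-rings, but it does not by itself guarantee polynomial running time.

```python
from functools import lru_cache


def max_ring_sum(ring):
    """Best total for a ring of non-negative numbers (hypothetical helper)."""

    @lru_cache(maxsize=None)
    def best(boxes):
        n = len(boxes)
        if n == 0:
            return 0
        if n <= 3:
            # Any pick destroys every other box, so take the largest one.
            return max(boxes)
        result = 0
        for i in range(n):
            # Rotate so the picked box sits at index 1 with its neighbours
            # at indices 0 and 2; everything from index 3 on survives.
            rotated = boxes[i - 1:] + boxes[:i - 1]
            result = max(result, boxes[i] + best(rotated[3:]))
        return result

    return best(tuple(ring))


print(max_ring_sum([10, 3, 8, 12]))
print(max_ring_sum([10, 19, 10, 0]))
```

The two calls give the same totals as the output above (18 and 20). Recovering the order of picks would need the same bookkeeping as print_indices.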