Postgres spatial SQL queries - postgresql

I'm trying to wrap my head about how postgres spatial stuff works from a SQL point of view. My goal is to be able to insert polygon geometry references as a column onto a table that also includes other information -- geometry name, et cetera.
I've started out by importing shapefiles into a geometry enabled postgres database. The polygon tables have been created off on a different schema; we'll use polygonGeometry as a example. So, public.geometry_columns has references off to polygonGeometry.(table) for each of my inserted polygons.
I then want to create another table, that has an id (serial primary key), a name (character variable), and a reference to the geometry (either polygon, or a refernece to a different key) that I inserted. How do I go about setting this schema up?
I then have another table with a id (serial primary key), lat (real) and long (real). What SQL query would I run to select geometries from the first table by primary key id, combine them with ST_Union, and return points from the second table created with ST_GeomFromText with the lat and long columns for each row that are within the unioned polygon?
Additionally, does anyone know any good references for getting up to speed with the spatial stuff from a schema, design, and usage standpoint assuming comfortable familiarity with SQL?

I think if you take a look at https://gis.stackexchange.com/ you will find answers to most (if not all) of your questions asked. Search under the postgis tag.
For resources, I really liked the book "PostGIS in Action".
You can get links and learn more at How do I get started with PostGis? and Spatial databases learning resources for newbies.

Related

How can I add SRID 4326 (Spatial Types) to Workbench when adding columns?

When I add a column with type POINT in the EER Diagram, is there anything I can do with that diagram so when I generate automatically the scripts, SRID 4326 is attached to CREATE TABLE script? If I don't setup that number, then by default is zero (flat), but I do need 4326 (sphere).
If not possible, does that mean I cannot synchronise my model with my server automatically and I have to add these changes manually all the time?
I couldn't figure this out either. I believe adding a SRID to a column is currently not supported by MySQL Workbench.
To check that it is indeed not supported, I did the following:
Added a SRID to a column of an existing DB
Reversed engineered a script of this DB (using Workbench)
Checked the script if it included the set SRID for the column
Was disappointed that it didn't...
The "good" news is though, that as it is not supported, MySQL Workbench won't pick up on a missing SRID on a column when synchronizing sources.
This means that once you set the SRID on a column yourself, it won't cause any problems when synchronizing in the futures.
Note that in order to set a SRID on a column, there can't be a (spatial) index on that column. Therefore, you must remove the index, set the SRID and then add the index back.
Below is a short and simple script, which I used to do this. Don't forget to update it to your use-case:
DROP INDEX `my_idx` ON my_table;
ALTER TABLE my_table MODIFY COLUMN my_column POINT NOT NULL SRID 4326;
ALTER TABLE my_table ADD SPATIAL INDEX `my_idx` (`my_column`) VISIBLE;

Inserting into postgres database

I am working trying to write an insert query into a backup database. I writing place and entities tables into this database. The issue is entities is linked to place via place.id column. I added a column place.original_id in the place table to store it's original 'id'. so now that i entered place into the new database it's id column changed but i have the original id stored so I can still link entities table to it. I am trying to figure out how to write entities to get the new id
so far i am at this point:
insert into entities_backup (id, place_id)
select
nextval('public.entities_backup_id_seq'),
(select id from places where original_id = (select place_id from entities) as place_id
from
entities
I know I am missing something because this does not work. I need to grab the id column from places when entity.place_id = places.original_id. Any help would be great.
I think this is what you want
insert into entities_backup (id, place_id)
select nextval('public.entities_backup_id_seq'), places.id
from places, entities
where places.original_id = entities.place_id;
I am working trying to write an insert query into a backup database. I writing place and entities tables into this database. The issue is entities is linked to place via place.id column. I added a column place.original_id in the place table to store it's original 'id'. so now that i entered place into the new database it's id column changed but i have the original id stored so I can still link entities table to it.
It would be simpler to not have this problem in the first place.
Rather than trying to fix this up after the fact, the better solution is to dump and load places and entities complete with their primary and foreign keys intact. Oracle's EXPORT or a utility such as ora2pg should be able to do it.
Sorry I can't say more. I know Postgres, not Oracle.

How to add geometry column using pgAdmin

I'm using a database created in the PostgreSQL. In its schema there are two tables and in one of them I want to add a geometry column.
The problem is that I created the postgis Extension (CREATE EXTENSION postgis;) for the database, but I'm not able to add this data type (geometry) column using pgAdmin.
To do this with pgAdmin's "New Column..." dialog, if you can't find geometry, then you might be able to find public.geometry instead (if PostGIS was installed there, which is normal).
However, I advise against using pgAdmin for creating geometry columns, as it does not understand typmods used to define the geometry type and SRID.
The best way is using DDL to directly manipulate the table, e.g.:
ALTER TABLE locations ADD COLUMN geom geometry(PointZ,4326);
to add a geom column of XYZ points (long, lat, alt).

How to join Non-Spatial table to Spatial table in Postgis database (Both tables are in different database)

I need to join a Non-spatial table to Spatial table, but both are in different database.
Spatial_table(dbname:dist)
gid
district_name
district_code
geom
Non_Spatial(dbname:census)
ID
district_code
population
male_popu
female_popu
Can anyone please suggest me, How to relate the above to tables to get the query result for the population of specific district?
Also can anyone tell me about the difference between Joining and Relating of two tables.
You can't do a cross-database join in PostgreSQL. The way MySQL uses databases PostgreSQL uses schemas.
There is an add-on called dblink though which lets you query another PG database (even on another machine).
That's not going to be very efficient with a join though, because it's going to have to transfer a whole table in one direction or the other to do the comparisons. If you are going to regularly want to join tables they need to be in the same database (but perhaps separate schemas).

When to use inherited tables in PostgreSQL?

In which situations you should use inherited tables? I tried to use them very briefly and inheritance didn't seem like in OOP world.
I thought it worked like this:
Table users has all fields required for all user levels. Tables like moderators, admins, bloggers, etc but fields are not checked from parent. For example users has email field and inherited bloggers has it now too but it's not unique for both users and bloggers at the same time. ie. same as I add email field to both tables.
The only usage I could think of is fields that are usually used, like row_is_deleted, created_at, modified_at. Is this the only usage for inherited tables?
There are some major reasons for using table inheritance in postgres.
Let's say, we have some tables needed for statistics, which are created and filled each month:
statistics
- statistics_2010_04 (inherits statistics)
- statistics_2010_05 (inherits statistics)
In this sample, we have 2.000.000 rows in each table. Each table has a CHECK constraint to make sure only data for the matching month gets stored in it.
So what makes the inheritance a cool feature - why is it cool to split the data?
PERFORMANCE: When selecting data, we SELECT * FROM statistics WHERE date BETWEEN x and Y, and Postgres only uses the tables, where it makes sense. Eg. SELECT * FROM statistics WHERE date BETWEEN '2010-04-01' AND '2010-04-15' only scans the table statistics_2010_04, all other tables won't get touched - fast!
Index size: We have no big fat table with a big fat index on column date. We have small tables per month, with small indexes - faster reads.
Maintenance: We can run vacuum full, reindex, cluster on each month table without locking all other data
For the correct use of table inheritance as a performance booster, look at the postgresql manual.
You need to set CHECK constraints on each table to tell the database, on which key your data gets split (partitioned).
I make heavy use of table inheritance, especially when it comes to storing log data grouped by month. Hint: If you store data, which will never change (log data), create or indexes with CREATE INDEX ON () WITH(fillfactor=100); This means no space for updates will be reserved in the index - index is smaller on disk.
UPDATE:
fillfactor default is 100, from http://www.postgresql.org/docs/9.1/static/sql-createtable.html:
The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default
"Table inheritance" means something different than "class inheritance" and they serve different purposes.
Postgres is all about data definitions. Sometimes really complex data definitions. OOP (in the common Java-colored sense of things) is about subordinating behaviors to data definitions in a single atomic structure. The purpose and meaning of the word "inheritance" is significantly different here.
In OOP land I might define (being very loose with syntax and semantics here):
import life
class Animal(life.Autonomous):
metabolism = biofunc(alive=True)
def die(self):
self.metabolism = False
class Mammal(Animal):
hair_color = color(foo=bar)
def gray(self, mate):
self.hair_color = age_effect('hair', self.age)
class Human(Mammal):
alcoholic = vice_boolean(baz=balls)
The tables for this might look like:
CREATE TABLE animal
(name varchar(20) PRIMARY KEY,
metabolism boolean NOT NULL);
CREATE TABLE mammal
(hair_color varchar(20) REFERENCES hair_color(code) NOT NULL,
PRIMARY KEY (name))
INHERITS (animal);
CREATE TABLE human
(alcoholic boolean NOT NULL,
FOREIGN KEY (hair_color) REFERENCES hair_color(code),
PRIMARY KEY (name))
INHERITS (mammal);
But where are the behaviors? They don't fit anywhere. This is not the purpose of "objects" as they are discussed in the database world, because databases are concerned with data, not procedural code. You could write functions in the database to do calculations for you (often a very good idea, but not really something that fits this case) but functions are not the same thing as methods -- methods as understood in the form of OOP you are talking about are deliberately less flexible.
There is one more thing to point out about inheritance as a schematic device: As of Postgres 9.2 there is no way to reference a foreign key constraint across all of the partitions/table family members at once. You can write checks to do this or get around it another way, but its not a built-in feature (it comes down to issues with complex indexing, really, and nobody has written the bits necessary to make that automatic). Instead of using table inheritance for this purpose, often a better match in the database for object inheritance is to make schematic extensions to tables. Something like this:
CREATE TABLE animal
(name varchar(20) PRIMARY KEY,
ilk varchar(20) REFERENCES animal_ilk NOT NULL,
metabolism boolean NOT NULL);
CREATE TABLE mammal
(animal varchar(20) REFERENCES animal PRIMARY KEY,
ilk varchar(20) REFERENCES mammal_ilk NOT NULL,
hair_color varchar(20) REFERENCES hair_color(code) NOT NULL);
CREATE TABLE human
(mammal varchar(20) REFERENCES mammal PRIMARY KEY,
alcoholic boolean NOT NULL);
Now we have a canonical reference for the instance of the animal that we can reliably use as a foreign key reference, and we have an "ilk" column that references a table of xxx_ilk definitions which points to the "next" table of extended data (or indicates there is none if the ilk is the generic type itself). Writing table functions, views, etc. against this sort of schema is so easy that most ORM frameworks do exactly this sort of thing in the background when you resort to OOP-style class inheritance to create families of object types.
Inheritance can be used in an OOP paradigm as long as you do not need to create foreign keys on the parent table. By example, if you have an abstract class vehicle stored in a vehicle table and a table car that inherits from it, all cars will be visible in the vehicle table but a foreign key from a driver table on the vehicle table won't match theses records.
Inheritance can be also used as a partitionning tool. This is especially usefull when you have tables meant to be growing forever (log tables etc).
Main use of inheritance is for partitioning, but sometimes it's useful in other situations. In my database there are many tables differing only in a foreign key. My "abstract class" table "image" contains an "ID" (primary key for it must be in every table) and PostGIS 2.0 raster. Inherited tables such as "site_map" or "artifact_drawing" have a foreign key column ("site_name" text column for "site_map", "artifact_id" integer column for the "artifact_drawing" table etc.) and primary and foreign key constraints; the rest is inherited from the the "image" table. I suspect I might have to add a "description" column to all the image tables in the future, so this might save me quite a lot of work without making real issues (well, the database might run little slower).
EDIT: another good use: with two-table handling of unregistered users, other RDBMSs have problems with handling the two tables, but in PostgreSQL it is easy - just add ONLY when you are not interrested in data in the inherited "unregistered user" table.
The only experience I have with inherited tables is in partitioning. It works fine, but it's not the most sophisticated and easy to use part of PostgreSQL.
Last week we were looking the same OOP issue, but we had too many problems with Hibernate - we didn't like our setup, so we didn't use inheritance in PostgreSQL.
I use inheritance when I have more than 1 on 1 relationships between tables.
Example: suppose you want to store object map locations with attributes x, y, rotation, scale.
Now suppose you have several different kinds of objects to display on the map and each object has its own map location parameters, and map parameters are never reused.
In these cases table inheritance would be quite useful to avoid having to maintain unnormalised tables or having to create location id’s and cross referencing it to other tables.
I tried some operations on it, I will not point out if is there any actual use case for database inheritance, but I will give you some detail for making your decision. Here is an example of PostgresQL: https://www.postgresql.org/docs/15/tutorial-inheritance.html
You can try below SQL script.
CREATE TABLE IF NOT EXISTS cities (
name text,
population real,
elevation int -- (in ft)
);
CREATE TABLE IF NOT EXISTS capitals (
state char(2) UNIQUE NOT NULL
) INHERITS (cities);
ALTER TABLE cities
ADD test_id varchar(255); -- Both table would contains test col
DROP TABLE cities; -- Cannot drop because capitals depends on it
ALTER TABLE cities
ADD CONSTRAINT fk_test FOREIGN KEY (test_id) REFERENCES sometable (id);
As you can see my comments, let me summarize:
When you add/delete/update fields -> the inheritance table would also be affected.
Cannot drop the parent table.
Foreign keys would not be inherited.
From my perspective, in growing applications, we cannot easily predict the changes in the future, for me I would avoid applying this to early database developing.
When features are stable as well and we want to create some database model which much likely the same as the existing one, we can consider that use case.
Use it as little as possible. And that usually means never, it boiling down to a way of creating structures that violate the relational model, for instance by breaking the information principle and by creating bags instead of relations.
Instead, use table partitioning combined with proper relational modelling, including further normal forms.