Cannot store a double value for a spatial datatype in MySQL 5.6 - mysql-5.6

I am trying to insert a polygon shape into a MySQL database. The vertices of a polygon consist of double values. I tried the following query, but got the error shown below.
INSERT INTO HIBERNATE_SPATIAL
(PRD_GEO_REGION_ID,OWNER_ID,GEO_REGION_NAME,GEO_REGION_DESCRIPTION,GEO_REGION_DEFINITION)
VALUES (8,1,'POLYGON8','SHAPE8',
        Polygon(LineString(10.12345612341243,11.12345612341234),
                LineString(10.34512341246,11.4123423456),
                LineString(10.31423424456,11.34123423456),
                LineString(10.341234256,11.3412342456),
                LineString(10.11423423456,11.123424)));
TABLE DESCRIPTION
+------------------------+--------------+------+-----+---------+----------------+
| Field                  | Type         | Null | Key | Default | Extra          |
+------------------------+--------------+------+-----+---------+----------------+
| PRD_GEO_REGION_ID      | int(11)      | NO   | PRI | NULL    | auto_increment |
| OWNER_ID               | decimal(3,0) | NO   |     | NULL    |                |
| GEO_REGION_NAME        | varchar(50)  | NO   | UNI | NULL    |                |
| GEO_REGION_DESCRIPTION | varchar(70)  | YES  |     | NULL    |                |
| GEO_REGION_DEFINITION  | geometry     | YES  |     | NULL    |                |
+------------------------+--------------+------+-----+---------+----------------+
THE ERROR:
ERROR 1367 (22007): Illegal non geometric '10.12345612341243' value found during parsing
Regards,
ArunRaj.

Solved the issue and stored the double values successfully. I had made two mistakes.
1) LineString can only hold values of the Point datatype (not bare decimal coordinates). The corrected syntax follows.
INSERT INTO HIBERNATE_SPATIAL
(PRD_GEO_REGION_ID,OWNER_ID,GEO_REGION_NAME,GEO_REGION_DESCRIPTION,GEO_REGION_DEFINITION)
VALUES (8,1,'POLYGON8','SHAPE8',
        Polygon(LineString(POINT(10.12345612341243,11.12345612341234)),
                LineString(POINT(10.34512341246,11.4123423456)),
                LineString(POINT(10.31423424456,11.34123423456)),
                LineString(POINT(10.341234256,11.3412342456)),
                LineString(POINT(10.11423423456,11.123424))));
2) A polygon shape has to be closed (the starting and ending points must be the same). That was the problem.
WORKING QUERY
INSERT INTO HIBERNATE_SPATIAL
(PRD_GEO_REGION_ID,OWNER_ID,GEO_REGION_NAME,GEO_REGION_DESCRIPTION,GEO_REGION_DEFINITION)
VALUES (8,1,'POLYGON8','SHAPE8',
        Polygon(LineString(POINT(10.12345612341243,11.12345612341234)),
                LineString(POINT(10.34512341246,11.4123423456)),
                LineString(POINT(10.31423424456,11.34123423456)),
                LineString(POINT(10.341234256,11.3412342456)),
                LineString(POINT(10.12345612341243,11.12345612341234))));
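For comparison, the same closed ring can also be written as WKT text, which sidesteps the constructor nesting entirely; a minimal sketch, assuming MySQL 5.6's GeomFromText() is available (the row values 9/'POLYGON9'/'SHAPE9' are illustrative, not from the original post):
INSERT INTO HIBERNATE_SPATIAL
(PRD_GEO_REGION_ID,OWNER_ID,GEO_REGION_NAME,GEO_REGION_DESCRIPTION,GEO_REGION_DEFINITION)
VALUES (9,1,'POLYGON9','SHAPE9',
        GeomFromText('POLYGON((10.12345612341243 11.12345612341234,
                               10.34512341246 11.4123423456,
                               10.31423424456 11.34123423456,
                               10.341234256 11.3412342456,
                               10.12345612341243 11.12345612341234))'));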

Related

Any way to find and delete almost similar records with SQL?

I have a table in a Postgres DB that has a lot of almost-identical rows. For example:
1. 00Zicky_-_San_Pedro_Danilo_Vigorito_Remix
2. 00Zicky_-_San_Pedro__Danilo_Vigorito_Remix__
3. 0101_-_Try_To_Say__Strictlyjaz_Unit_Future_Rmx__
4. 0101_-_Try_To_Say__Strictlyjaz_Unit_Future_Rmx_
5. 01_-_Digital_Excitation_-_Brothers_Gonna_Work_it_Out__Piano_Mix__
6. 01_-_Digital_Excitation_-_Brothers_Gonna_Work_it_Out__Piano_Mix__
I am thinking about writing a little golang script to remove the duplicates, but maybe SQL can do it?
Table definition:
\d+ songs
Table "public.songs"
    Column     |            Type             | Collation | Nullable |                Default                 | Storage  | Compression | Stats target | Description
---------------+-----------------------------+-----------+----------+----------------------------------------+----------+-------------+--------------+-------------
 song_id       | integer                     |           | not null | nextval('songs_song_id_seq'::regclass) | plain    |             |              |
 song_name     | character varying(250)      |           | not null |                                        | extended |             |              |
 fingerprinted | smallint                    |           |          | 0                                      | plain    |             |              |
 file_sha1     | bytea                       |           |          |                                        | extended |             |              |
 total_hashes  | integer                     |           | not null | 0                                      | plain    |             |              |
 date_created  | timestamp without time zone |           | not null | now()                                  | plain    |             |              |
 date_modified | timestamp without time zone |           | not null | now()                                  | plain    |             |              |
Indexes:
"pk_songs_song_id" PRIMARY KEY, btree (song_id)
Referenced by:
TABLE "fingerprints" CONSTRAINT "fk_fingerprints_song_id" FOREIGN KEY (song_id) REFERENCES songs(song_id) ON DELETE CASCADE
Access method: heap
I tried several methods to find duplicates, but those methods only search for exact matches.
There is no operator that essentially does A almost = B. (Well, there is full text search, but that seems a little excessive here.) If the only difference is the number of - and _ characters, then just get rid of them and compare the resulting strings. If they are equal, then one is a duplicate. You can use the replace() function to remove them. So, something like: (see demo)
delete
from songs s2
where exists ( select null
               from songs s1
               where s1.song_id < s2.song_id
                 and replace(replace(s1.song_name, '_',''),'-','') =
                     replace(replace(s2.song_name, '_',''),'-','')
             );
If your table is large this will not be fast, but a functional index may help:
create index song_name_idx on songs
    (replace(replace(song_name, '_',''),'-',''));
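Since the delete is destructive, it can be worth previewing which pairs the predicate would match before running it; a sketch using the same normalization:
select s1.song_id, s1.song_name,
       s2.song_id as dup_id, s2.song_name as dup_name
from songs s1
join songs s2
  on  s1.song_id < s2.song_id
  and replace(replace(s1.song_name, '_',''),'-','') =
      replace(replace(s2.song_name, '_',''),'-','')
order by s1.song_id;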

How to insert NULL values into a PostgreSQL table

I have data in a CSV file that I am trying to insert into a PostgreSQL table using pgloader. The input file is from an MS SQL Server export, and NULL values are already explicitly written as the literal NULL.
My pgloader script seems to fail on the keyword NULL, noticeably for integer and timestamp fields.
I really don't know what I am missing. Your help will be much appreciated.
I can successfully insert into the table from psql console:
insert into raw.a2
(NUM , F_FILENO , F_FOLIONO , F_DOC_TYPE , F_DOCDATE , F_BATCH , F_BOX , F_BLUCPY , F_ROUTOPOST , F_ROUTOUSR , F_WFCREATE , LINKEDFILE , DATECREATE , USERCREATE , DATEUPDATE , USERUPDATE , MEDIA , PGCOUNT , GROUPNUM , SUBJECT , PRI , F_FILECAT)
values
(
16,'18',3,'Nomination Details',NULL,NULL,NULL,1,NULL,NULL,1,'00000016.TIF','2011-02-08 13:02:11.000','isaac','2012-01-12 08:52:31.000','henrey','Multi',4,1.0,0,'-',NULL
);
INSERT 0 1
file-sample
1,'6',1,'Details',2011-02-22 00:00:00.000,NULL,NULL,1,NULL,NULL,2,'00000001.TIF',2011-02-08 09:42:24.000,'kevin',2011-10-27 09:08:42.000,'james','Multi',1,1.0,0,'-',NULL
2,'6',2,'Bio data',NULL,NULL,NULL,1,NULL,NULL,2,'00000002.TIF',2011-02-08 10:25:11.000,'kevin',2012-11-19 16:20:49.000,'pattie','Multi',4,1.0,0,'-',NULL
4,'10',1,'Details',2011-02-22 00:00:00.000,NULL,NULL,1,NULL,NULL,2,'00000004.TIF',2011-02-08 10:43:38.000,'kevin',2014-07-18 10:46:06.000,'brian','Multi',1,1.0,0,'-',NULL
pgloader commands
pgloader --type csv --with truncate --with "fields optionally enclosed by '''" --with "fields terminated by ','" --set "search_path to 'raw'" - "postgresql://postgres:postgres@localhost/doc_db?a2" < null_test
Table
Table "raw.a2"
   Column    |            Type             | Collation | Nullable | Default
-------------+-----------------------------+-----------+----------+---------
 num         | integer                     |           | not null |
 f_fileno    | character varying(15)       |           |          |
 f_foliono   | integer                     |           |          |
 f_doc_type  | character varying(50)       |           |          |
 f_docdate   | timestamp without time zone |           |          |
 f_batch     | integer                     |           |          |
 f_box       | integer                     |           |          |
 f_blucpy    | integer                     |           |          |
 f_routopost | integer                     |           |          |
 f_routousr  | character varying(49)       |           |          |
 f_wfcreate  | integer                     |           |          |
 linkedfile  | character varying(255)      |           |          |
 datecreate  | timestamp without time zone |           |          |
 usercreate  | character varying(50)       |           |          |
 dateupdate  | timestamp without time zone |           |          |
 userupdate  | character varying(50)       |           |          |
 media       | character varying(5)        |           |          |
 pgcount     | smallint                    |           |          |
 groupnum    | double precision            |           |          |
 subject     | smallint                    |           |          |
 pri         | character varying(1)        |           |          |
 f_filecat   | character varying(50)       |           |          |
Indexes:
"a2_pkey" PRIMARY KEY, btree (num)
Output/Error
2019-07-24T05:55:24.231000Z WARNING Target table "\"raw\".\"a2\"" has 1 indexes defined against it.
2019-07-24T05:55:24.237000Z WARNING That could impact loading performance badly.
2019-07-24T05:55:24.237000Z WARNING Consider the option 'drop indexes'.
2019-07-24T05:55:24.460000Z ERROR PostgreSQL ["\"raw\".\"a2\""] Database error 22P02: invalid input syntax for integer: "NULL"
CONTEXT: COPY a2, line 1, column f_batch: "NULL"
2019-07-24T05:55:24.461000Z ERROR PostgreSQL ["\"raw\".\"a2\""] Database error 22007: invalid input syntax for type timestamp: "NULL"
CONTEXT: COPY a2, line 1, column f_docdate: "NULL"
From the docs of pgloader:
null if
This option takes an argument which is either the keyword blanks or a
double-quoted string.
When blanks is used and the field value that is read contains only
space characters, then it’s automatically converted to an SQL NULL
value.
When a double-quoted string is used and that string is read as the
field value, then the field value is automatically converted to an SQL
NULL value.
Looks like you're missing --with 'null if "NULL"' in your command.
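With that option added, the full invocation would look something like this (an untested sketch of the same command from the question):
pgloader --type csv --with truncate \
         --with "fields optionally enclosed by '''" \
         --with "fields terminated by ','" \
         --with 'null if "NULL"' \
         --set "search_path to 'raw'" \
         - "postgresql://postgres:postgres@localhost/doc_db?a2" < null_test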
Otherwise, you should be able to load the CSV data directly from psql:
\copy raw.a2 (NUM, F_FILENO, F_FOLIONO, F_DOC_TYPE, F_DOCDATE, F_BATCH, F_BOX, F_BLUCPY, F_ROUTOPOST, F_ROUTOUSR, F_WFCREATE, LINKEDFILE, DATECREATE, USERCREATE, DATEUPDATE, USERUPDATE, MEDIA, PGCOUNT, GROUPNUM, SUBJECT, PRI, F_FILECAT) FROM 'file.csv' WITH (FORMAT csv, NULL 'NULL')

GIST index creation too slow on PostgreSQL

I have a database in PostgreSQL with the following structure:
   Column    |         Type          | Collation | Nullable |                    Default
-------------+-----------------------+-----------+----------+------------------------------------------------
 vessel_hash | integer               |           | not null | nextval('samplecol_vessel_hash_seq'::regclass)
 status      | character varying(50) |           |          |
 station     | character varying(50) |           |          |
 speed       | character varying(10) |           |          |
 longitude   | numeric(12,8)         |           |          |
 latitude    | numeric(12,8)         |           |          |
 course      | character varying(50) |           |          |
 heading     | character varying(50) |           |          |
 timestamp   | character varying(50) |           |          |
 the_geom    | geometry              |           |          |
Check constraints:
"enforce_dims_the_geom" CHECK (st_ndims(the_geom) = 2)
"enforce_geotype_geom" CHECK (geometrytype(the_geom) = 'POINT'::text OR the_geom IS NULL)
"enforce_srid_the_geom" CHECK (st_srid(the_geom) = 4326)
The database contains ~146,000,000 records, and the size of the table that contains the data is:
public | samplecol | table | postgres | 31 GB |
I try to create a GIST index on the geometry field the_geom with this command:
create index samplecol_the_geom_gist on samplecol using gist (the_geom);
but it takes too long; it has already been running for 2 hours.
Based on the question Slow indexing of 300GB Postgis table, before index creation I executed this in the psql console:
ALTER SYSTEM SET maintenance_work_mem = '1GB';
ALTER SYSTEM
SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
But the index creation still takes too long. Does anyone know why, and how to fix it?
I am afraid you'll have to sit it out.
Apart from high maintenance_work_mem, there is not really a tuning option.
Increasing max_wal_size will help somewhat, since you will get fewer checkpoints.
If you can't live with an ACCESS EXCLUSIVE lock for that long, try CREATE INDEX CONCURRENTLY, which will be even slower, but won't block concurrent database activity.
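A minimal sketch of that route, combining a session-level memory bump with the concurrent build (assuming the server has the RAM to spare):
-- affects only the current session; no ALTER SYSTEM needed
SET maintenance_work_mem = '1GB';
-- avoids the ACCESS EXCLUSIVE lock, at the cost of an even longer build
CREATE INDEX CONCURRENTLY samplecol_the_geom_gist
    ON samplecol USING gist (the_geom);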

Postgres: return true/false if matches found from inner join?

I have implemented a tagging system in my application, using Postgres 9.6. There are three tables.
Projects
Table "public.project"
   Column    |            Type             | Collation | Nullable |             Default
-------------+-----------------------------+-----------+----------+---------------------------------
 id          | integer                     |           | not null | nextval('tag_id_seq'::regclass)
 name        | character varying(255)      |           | not null |
 user_id     | integer                     |           |          |
Tags
Table "public.tag"
   Column    |            Type             | Collation | Nullable |             Default
-------------+-----------------------------+-----------+----------+---------------------------------
 id          | integer                     |           | not null | nextval('tag_id_seq'::regclass)
 tag         | character varying(255)      |           | not null |
 user_id     | integer                     |           |          |
 is_internal | boolean                     |           | not null | false
Project tags
      Column      |            Type             | Collation | Nullable |                 Default
------------------+-----------------------------+-----------+----------+------------------------------------------
 id               | integer                     |           | not null | nextval('project_tag_id_seq'::regclass)
 tag_id           | integer                     |           | not null |
 project_id       | integer                     |           |          |
 user_id          | integer                     |           | not null |
Now I want to get a list of all the projects, annotated with a column that indicates (for a particular tag) whether each project has that tag.
So I'd like the results to look like this:
id   name   has_favorite_tag
1    foo    true
2    bar    false
3    baz    false
This is my query so far:
select p.*, CASE(XXXX) as has_project_tag
from project p
join (select * from project_tag where tag_id=1) pt on p.id=pt.project_id
I know that I want to use CASE to be true when the number of matching project_tag rows is greater than 0 - but how do I do this?
(In reality the project table has many more fields, of course.)
Here's a possibility (unfiltered for tag_id; add that to the inner select if necessary):
select project.*,
       exists (select 1
               from project_tag
               where project_id = project.id) as has_project_tag
from project;
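With the tag filter from the question folded in (tag_id = 1, as in the subquery of the original attempt), a sketch that yields the has_favorite_tag column from the desired output:
select p.*,
       exists (select 1
               from project_tag pt
               where pt.project_id = p.id
                 and pt.tag_id = 1) as has_favorite_tag
from project p;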

PostgreSQL two groups segregated but not ordered only by zero price column

I need help with a bit of a crazy single-query goal, please, and I'm not sure whether GROUP BY or a sub-SELECT applies.
The following query:
SELECT id_finish, description, inside_rate, outside_material, id_part, id_metal
FROM parts_finishing AS pf
LEFT JOIN parts_finishing_descriptions AS fd ON (pf.id_description=fd.id);
returns results like the following:
+-------------+-------------+------------------+--------------------------------+
| description | inside_rate | outside_material | id_part - id_finish - id_metal |
+-------------+-------------+------------------+--------------------------------+
| Nickle      | 0           | 33.44            | 4444-44-44, 5555-55-55         |
+-------------+-------------+------------------+--------------------------------+
| Bend        | 11.22       | 0                | 1111-11-11                     |
+-------------+-------------+------------------+--------------------------------+
| Pack        | 22.33       | 0                | 2222-22-22, 3333-33-33         |
+-------------+-------------+------------------+--------------------------------+
| Zinc        | 0           | 44.55            | 6000-66-66                     |
+-------------+-------------+------------------+--------------------------------+
I need the results returned in the fashion below, but there are catches:
I need to group by either the inside_rate column or the outside_material column, but ORDER BY the description column, and not order or sort them by price (inside_rate and outside_material are the prices). A row belongs to one group if inside_rate is 0 and to the other group if outside_material is 0.
I need to ORDER BY the description column descending as a secondary sort within each group.
I need to return a list of parts (composed of three separate columns) for that inside/outside group and price for that finishing.
+-------------+-------------+------------------+--------------------------------+
| description | inside_rate | outside_material | id_part - id_finish - id_metal |
+-------------+-------------+------------------+--------------------------------+
| Bend        | 11.22       | 0                | 1111-11-11                     |
+-------------+-------------+------------------+--------------------------------+
| Pack        | 22.33       | 0                | 2222-22-22, 3333-33-33         |
+-------------+-------------+------------------+--------------------------------+
| Nickle      | 0           | 33.44            | 4444-44-44, 5555-55-55         |
+-------------+-------------+------------------+--------------------------------+
| Zinc        | 0           | 44.55            | 6000-66-66                     |
+-------------+-------------+------------------+--------------------------------+
The tables I'm working with and their data types:
Table "public.parts_finishing"
      Column      |  Type   |                          Modifiers
------------------+---------+---------------------------------------------------------------
 id               | bigint  | not null default nextval('parts_finishing_id_seq'::regclass)
 id_part          | bigint  |
 id_finish        | bigint  |
 id_metal         | bigint  |
 id_description   | bigint  |
 date             | date    |
 inside_hours_k   | numeric |
 inside_rate      | numeric |
 outside_material | numeric |
 sort             | integer |
Indexes:
"parts_finishing_pkey" PRIMARY KEY, btree (id)
Table "public.parts_finishing_descriptions"
   Column    |  Type   |                                Modifiers
-------------+---------+----------------------------------------------------------------------------
 id          | bigint  | not null default nextval('parts_finishing_descriptions_id_seq'::regclass)
 date        | date    |
 description | text    |
 rate_hour   | numeric |
 type        | text    |
Indexes:
"parts_finishing_descriptions_pkey" PRIMARY KEY, btree (id)
The second table's first column is just id. (Why are we still dealing with a 1024 static width layout in 2015?)
I'd make an SQL fiddle though it refuses to load for me regardless of the browser.
Not entirely sure I understand your question. Might look like this:
SELECT fd.description, pf.inside_rate, pf.outside_material
     , concat_ws(' - ', pf.id_part::text
                      , pf.id_finish::text
                      , pf.id_metal::text) AS id_part_finish_metal
FROM   parts_finishing pf
LEFT   JOIN parts_finishing_descriptions fd ON pf.id_description = fd.id
ORDER  BY (pf.inside_rate = 0)           -- 1. sorts the inside_rate group first
     ,  fd.description DESC NULLS LAST   -- 2. possible NULL values last
;
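If the comma-separated part lists in the desired output are meant to collapse several rows per finish into one, a hedged variation using string_agg(), assuming description plus the two price columns form the grouping key:
SELECT fd.description, pf.inside_rate, pf.outside_material
     , string_agg(concat_ws(' - ', pf.id_part::text
                                 , pf.id_finish::text
                                 , pf.id_metal::text), ', ') AS id_part_finish_metal
FROM   parts_finishing pf
LEFT   JOIN parts_finishing_descriptions fd ON pf.id_description = fd.id
GROUP  BY fd.description, pf.inside_rate, pf.outside_material
ORDER  BY (pf.inside_rate = 0), fd.description DESC NULLS LAST;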