How to select rows in a postgres table based on a date and store it in a new table? - postgresql

I have a Postgres table with some data. Each row has a date associated with it. I want to extract the rows whose date falls in April. Here is a CSV version of my table data:
,date,location,device,provider,cpu,mem,load,drops,id,latency,gw_latency,upload,download,sap_drops,sap_latency,alert_id
0,2018-02-10 11:52:59.342269+00:00,BEM,10.11.100.1,COD,6.0,23.0,11.75,0.0,,,,,,,,
1,2018-02-10 11:53:04.006971+00:00,VER,10.11.100.1,KOD,6.0,23.0,4.58,0.0,,,,,,,,
2,2018-03-25 20:28:36.186015+00:00,RET,10.11.100.1,POL,7.0,26.0,9.83,0.0,,86.328,5.0,4.33,15.33,0.0,23.0,
3,2018-03-25 20:28:59.155453+00:00,ASR,10.12.100.1,VOL,5.0,14.0,2.67,0.0,,52.406,12.0,2.17,3.17,0.0,28.0,
4,2018-04-01 13:16:44.472119+00:00,RED,10.19.0.1,SEW,6.0,14.0,2.77,0.0,,52.766,2.0,3.25,2.29,0.0,1.0,0.0
5,2018-04-01 13:16:48.478708+00:00,RED,10.19.0.1,POL,6.0,14.0,4.065,0.0,,52.766,1.0,6.63,1.5,0.0,1.0,0.0
6,2018-04-06 21:00:44.769702+00:00,GOK,10.61.100.1,FDE,4.0,22.0,3.08,0.0,,54.406,8.0,3.33,2.83,0.0,19.0,0.0
7,2018-04-06 21:01:07.211395+00:00,WER,10.4.100.1,FDE,3.0,3.0,9.28,0.0,,0.346,2.0,10.54,8.02,0.0,33.0,0.0
8,2018-04-13 11:18:08.411550+00:00,DER,10.19.0.1,CVE,14.0,14.0,7.88,0.0,,50.545,2.0,6.17,9.59,0.0,1.0,0.0
9,2018-04-13 11:18:12.420974+00:00,RTR,10.19.0.1,BOL,14.0,14.0,1.345,0.0,,50.545,1.0,2.26,0.43,0.0,1.0,0.0
So I want only the rows with April dates, so that I end up with a table that looks something like this:
4,2018-04-01 13:16:44.472119+00:00,RED,10.19.0.1,SEW,6.0,14.0,2.77,0.0,,52.766,2.0,3.25,2.29,0.0,1.0,0.0
5,2018-04-01 13:16:48.478708+00:00,RED,10.19.0.1,POL,6.0,14.0,4.065,0.0,,52.766,1.0,6.63,1.5,0.0,1.0,0.0
6,2018-04-06 21:00:44.769702+00:00,GOK,10.61.100.1,FDE,4.0,22.0,3.08,0.0,,54.406,8.0,3.33,2.83,0.0,19.0,0.0
7,2018-04-06 21:01:07.211395+00:00,WER,10.4.100.1,FDE,3.0,3.0,9.28,0.0,,0.346,2.0,10.54,8.02,0.0,33.0,0.0
8,2018-04-13 11:18:08.411550+00:00,DER,10.19.0.1,CVE,14.0,14.0,7.88,0.0,,50.545,2.0,6.17,9.59,0.0,1.0,0.0
9,2018-04-13 11:18:12.420974+00:00,RTR,10.19.0.1,BOL,14.0,14.0,1.345,0.0,,50.545,1.0,2.26,0.43,0.0,1.0,0.0
Now, if I try to extract a particular date with the query below:
select * from metrics_data where date = 2018-04-13;
I get the error message
No operator matches the given name and argument type(s). You might need to add explicit type casts.
How do I get the rows for the month of April and store them in a new table, say april_data?
Below is the structure of my existing table
Column | Type | Modifiers | Storage | Stats target | Description
-------------+--------------------------+-----------+----------+--------------+-------------
date | timestamp with time zone | | plain | |
location | character varying(255) | | extended | |
device | character varying(255) | | extended | |
provider | character varying(255) | | extended | |
cpu | double precision | | plain | |
mem | double precision | | plain | |
load | double precision | | plain | |
drops | double precision | | plain | |
id | integer | | plain | |
latency | double precision | | plain | |
gw_latency | double precision | | plain | |
upload | double precision | | plain | |
download | double precision | | plain | |
sap_drops | double precision | | plain | |
sap_latency | double precision | | plain | |
alert_id | double precision | | plain | |

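The date column in your table has type timestamp with time zone, which is displayed as YYYY-MM-DD HH24:MI:SS.US plus a time zone offset. In your query the right-hand side is not quoted, so 2018-04-13 is parsed as integer arithmetic (2018 - 4 - 13 = 2001). No operator compares timestamp with time zone to an integer, so the query throws an error.
To fix it, quote the literal and make sure both sides of the comparison have the same type.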
In your case I suggest as below:
Match exactly one day:
select * from metrics_data where date(date) = '2018-04-13';
Match within one month (a half-open range, so timestamps in the last fraction of a second of April 30 are not missed):
select * from metrics_data where date >= '2018-04-01' AND date < '2018-05-01';
OR
select * from metrics_data where date(date) BETWEEN '2018-04-01' AND '2018-04-30';
OR
select * from metrics_data where to_char(date,'YYYY-MM') = '2018-04';
Match April of any year:
select * from metrics_data where to_char(date,'MM') = '04';
OR
select * from metrics_data where extract(month from date) = 4;
Hopefully this answer will help you.
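The question also asks how to store the result in a new table. A minimal sketch, assuming the target table april_data does not exist yet and that only April 2018 is wanted; the date literals are interpreted in the session's time zone:

```sql
-- Create april_data from the April 2018 rows of metrics_data.
-- The half-open range covers every timestamp in April, including
-- those with microseconds just before midnight on April 30.
CREATE TABLE april_data AS
SELECT *
FROM metrics_data
WHERE date >= '2018-04-01'
  AND date <  '2018-05-01';
```

If april_data already exists with the same columns, INSERT INTO april_data SELECT ... with the same WHERE clause should work instead.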

You need single quotes around the string literal.
PostgreSQL will automatically cast it to the correct data type (timestamp with time zone).
You could use the extract function to select only the dates from April:
SELECT * FROM yourtable WHERE extract (month FROM yourtable.date) = 4;
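Note that extract(month ...) on its own matches April of every year. A small sketch that restricts the match to April 2018, using the metrics_data table from the question in place of yourtable:

```sql
-- Match only rows from April 2018 rather than April of any year.
SELECT *
FROM metrics_data
WHERE extract(year  FROM date) = 2018
  AND extract(month FROM date) = 4;
```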

Related

Any way to find and delete almost similar records with SQL?

I have a table in Postgres DB, that has a lot of almost identical rows. For example:
1. 00Zicky_-_San_Pedro_Danilo_Vigorito_Remix
2. 00Zicky_-_San_Pedro__Danilo_Vigorito_Remix__
3. 0101_-_Try_To_Say__Strictlyjaz_Unit_Future_Rmx__
4. 0101_-_Try_To_Say__Strictlyjaz_Unit_Future_Rmx_
5. 01_-_Digital_Excitation_-_Brothers_Gonna_Work_it_Out__Piano_Mix__
6. 01_-_Digital_Excitation_-_Brothers_Gonna_Work_it_Out__Piano_Mix__
I was thinking about writing a small Go script to remove the duplicates, but maybe SQL can do it?
Table definition:
\d+ songs
Table "public.songs"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
---------------+-----------------------------+-----------+----------+----------------------------------------+----------+-------------+--------------+-------------
song_id | integer | | not null | nextval('songs_song_id_seq'::regclass) | plain | | |
song_name | character varying(250) | | not null | | extended | | |
fingerprinted | smallint | | | 0 | plain | | |
file_sha1 | bytea | | | | extended | | |
total_hashes | integer | | not null | 0 | plain | | |
date_created | timestamp without time zone | | not null | now() | plain | | |
date_modified | timestamp without time zone | | not null | now() | plain | | |
Indexes:
"pk_songs_song_id" PRIMARY KEY, btree (song_id)
Referenced by:
TABLE "fingerprints" CONSTRAINT "fk_fingerprints_song_id" FOREIGN KEY (song_id) REFERENCES songs(song_id) ON DELETE CASCADE
Access method: heap
I tried several methods to find duplicates, but they only find exact matches.
There is no operator which is essentially A almost = B. (Well, there is full text search, but that seems a little excessive here.) If the only difference is the number of - and _ characters, just strip them and compare the results. If the stripped names are equal, one row is a duplicate. You can use the replace() function to remove the characters. So something like:
delete
  from songs s2
 where exists ( select null
                  from songs s1
                 where s1.song_id < s2.song_id
                   and replace(replace(s1.song_name, '_',''),'-','') =
                       replace(replace(s2.song_name, '_',''),'-','')
              );
If your table is large this will not be fast, but a functional index may help:
create index song_name_idx on songs
       (replace(replace(song_name, '_',''),'-',''));
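Before running the delete, it may be worth previewing which rows it would remove. The query below is a sketch that lists each row that has an earlier row with the same stripped name, next to the song_id that would be kept; the kept_song_id alias is only illustrative. Keep in mind that, per the table definition in the question, deleting a song also deletes its rows in fingerprints through ON DELETE CASCADE.

```sql
-- Preview: rows that have an earlier row with the same name once
-- '_' and '-' are removed, plus the lowest matching song_id.
SELECT s2.song_id,
       s2.song_name,
       min(s1.song_id) AS kept_song_id
FROM songs s2
JOIN songs s1
  ON s1.song_id < s2.song_id
 AND replace(replace(s1.song_name, '_', ''), '-', '')
   = replace(replace(s2.song_name, '_', ''), '-', '')
GROUP BY s2.song_id, s2.song_name
ORDER BY s2.song_id;
```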

How to restore data values that were already converted to scientific notation in a table

I have a problem: some values I inserted into my table were automatically converted to scientific notation (e.g. 8.24e+04). Does anyone know how to restore the original values, or how to keep the original values in the table?
I'm using double precision as the data type for the column, and I just noticed that double precision often displays long numbers in scientific notation.
This is how the table looks after I inserted some values:
test=# select * from demo;
| string_col | values |
|------------|-----------------------|
| Rocket | 123228435521 |
| Test | 13328422942213 |
| Power | 1.243343991231232e+15 |
| Pull | 1.233433459353712e+15 |
| Drag | 1244375399128 |
edb=# \d+ demo;
Table "public.demo"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
------------+-----------------------+-----------+----------+---------+----------+--------------+-------------
string_col | character varying(20) | | | | extended | |
values | double precision | | | | plain | |
Access method: heap
This is just a dummy table I used to illustrate my question.
You'll have to format the number using to_char if you want it in a specific format:
SELECT 31672516735473059594023526::double precision,
to_char(
31672516735473059594023526::double precision,
'999999999999999999999999999.99999999999999FM'
);
float8 │ to_char
═══════════════════════╪════════════════════════════
3.167251673547306e+25 │ 31672516735473058997862400
(1 row)
The result is not exact because the precision of double precision is not high enough.
If you don't want the rounding errors and want to avoid scientific notation as well, use the data type numeric instead.
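A minimal sketch of the numeric approach, using a hypothetical table demo_numeric and a hypothetical column name amount (the word values is reserved in PostgreSQL and would have to be written as "values"). Values stored as numeric from the start keep all their digits; casting existing double precision data to numeric cannot bring back digits that were already lost.

```sql
-- Hypothetical table: same shape as demo, but with a numeric column.
CREATE TABLE demo_numeric (
    string_col varchar(20),
    amount     numeric
);

-- The large number is stored exactly and displayed without an exponent.
INSERT INTO demo_numeric (string_col, amount)
VALUES ('Big', 31672516735473059594023526);

SELECT string_col, amount FROM demo_numeric;
```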

Why can't I use a plsql argument in this where clause?

I have a function below (is_organizer) that works, and lets me use this method as a computed field in Hasura. The function below (is_chapter_member) which is almost identical, doesn't work.
WORKS
CREATE OR REPLACE FUNCTION is_organizer(event_row events, hasura_session json)
RETURNS boolean AS $$
SELECT EXISTS (
SELECT 1
FROM event_organizers o
WHERE
o.user_id::text = hasura_session->>'x-hasura-user-id'
AND
(event_row.id = o.event_id OR event_row.event_template_id = o.event_template_id)
);
$$ LANGUAGE SQL STRICT IMMUTABLE;
BROKEN
CREATE OR REPLACE FUNCTION is_chapter_member(c chapters, hasura_session json)
RETURNS boolean AS $$
SELECT EXISTS (
SELECT 1
FROM chapter_members m
WHERE
m.user_id::text = hasura_session->>'x-hasura-user-id'
AND
c.chapter_id = m.chapter_id
);
$$ LANGUAGE SQL STRICT IMMUTABLE;
When attempting to add this function (not call it, just create it) Postgres gives me the following error:
ERROR: missing FROM-clause entry for table "c"
LINE 9: c.chapter_id = m.chapter_id
Why would a function parameter need a FROM-clause entry? Table dumps below...
Table "public.chapters"
Column | Type | Collation | Nullable | Default
-----------------+--------------------------+-----------+----------+--------------------------------------
id | integer | | not null | nextval('chapters_id_seq'::regclass)
title | text | | not null |
slug | text | | not null |
description | jsonb | | |
avatar_url | text | | |
photo_url | text | | |
region | text | | |
maps_api_result | jsonb | | |
lat | numeric(11,8) | | |
lng | numeric(11,8) | | |
created_at | timestamp with time zone | | not null | now()
updated_at | timestamp with time zone | | not null | now()
deleted_at | timestamp with time zone | | |
Table "public.chapter_members"
Column | Type | Collation | Nullable | Default
------------+--------------------------+-----------+----------+---------
user_id | integer | | not null |
chapter_id | integer | | not null |
created_at | timestamp with time zone | | not null | now()
updated_at | timestamp with time zone | | not null | now()
Table "public.events"
Column | Type | Collation | Nullable | Default
-------------------+-----------------------------+-----------+----------+---------------------------------------------------
id | integer | | not null | nextval('events_id_seq'::regclass)
event_template_id | integer | | not null |
venue_id | integer | | |
starts_at | timestamp without time zone | | not null |
duration | interval | | not null |
title | text | | |
slug | text | | |
description | text | | |
photo_url | text | | |
created_at | timestamp without time zone | | not null | now()
updated_at | timestamp without time zone | | not null | now()
deleted_at | timestamp without time zone | | |
ends_at | timestamp without time zone | | | generated always as (starts_at + duration) stored
Table "public.event_organizers"
Column | Type | Collation | Nullable | Default
-------------------+---------+-----------+----------+----------------------------------------------
id | integer | | not null | nextval('event_organizers_id_seq'::regclass)
user_id | integer | | not null |
event_id | integer | | |
event_template_id | integer | | |
This turned out to be caused by an incorrect column name in the broken function: chapter_id should have just been id on the c argument. I took Richard's prompt and tried putting parentheses around the argument, as in (c).chapter_id. Postgres then correctly told me that chapter_id doesn't exist, which allowed me to fix the issue.
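For reference, a sketch of the function with the column name corrected. The only change needed for the error is c.id; marking the function STABLE instead of IMMUTABLE is an additional suggestion, not part of the original fix, since the function reads from a table.

```sql
CREATE OR REPLACE FUNCTION is_chapter_member(c chapters, hasura_session json)
RETURNS boolean AS $$
  SELECT EXISTS (
    SELECT 1
    FROM chapter_members m
    WHERE m.user_id::text = hasura_session->>'x-hasura-user-id'
      -- chapters has no chapter_id column; its key column is id
      AND c.id = m.chapter_id
  );
$$ LANGUAGE SQL STRICT STABLE;
```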

Sorting Issue with Underscore in Postgres

I'm trying to sort the data below, but Postgres returns the wrong order.
Can someone please help me here? How can I get the data sorted properly?
I wrote the query below to get the data:
SELECT * FROM TempTable ORDER BY a_test ASC NULLS FIRST;
and it returns a result like this:
| BB001217 |
| BB001217_000010 |
| BB001217_000011 |
| BB001217_00002 |
| BB001217_00003 |
| BB001218 |
| BB001219 |
| BB001220 |
| BB001220_000010 |
| BB001220_000011 |
| BB001220_00002 |
| BB001220_00003 |
| BB001220_00004 |
| BB001220_00005 |
| BB001220_00006 |
I expected the result in this form:
| BB001217 |
| BB001217_00002 |
| BB001217_00003 |
| BB001217_000010 |
| BB001217_000011 |
| BB001218 |
| BB001219 |
| BB001220 |
| BB001220_00002 |
| BB001220_00003 |
| BB001220_00004 |
| BB001220_00005 |
| BB001220_00006 |
| BB001220_000010 |
| BB001220_000011 |
From PostgreSQL v10 on you could use an ICU collation that provides “natural sorting”:
CREATE COLLATION english_natural (
LOCALE = 'en-US-u-kn-true',
PROVIDER = icu
);
SELECT *
FROM TempTable
ORDER BY a_test COLLATE english_natural
ASC NULLS FIRST;
You are storing numbers in a VARCHAR column, so the sorting is based on character comparison, where '10' is considered smaller than '2'.
You need to split the column into two parts, convert the second part to a number, and sort on both:
SELECT *
FROM temptable
ORDER BY split_part(a_test,'_',1),
nullif(split_part(a_test,'_',2),'')::int ASC NULLS FIRST;
Online example: https://rextester.com/RNU44666
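The split-and-cast ordering can also be tried without the table. This sketch feeds a few of the sample values from the question through a VALUES list; rows without a suffix produce NULL for the second sort key and therefore come first within their prefix.

```sql
-- Self-contained check of the ordering using sample values.
SELECT a_test
FROM (VALUES ('BB001217_000010'),
             ('BB001217'),
             ('BB001217_00003'),
             ('BB001217_00002')) AS t(a_test)
ORDER BY split_part(a_test, '_', 1),
         nullif(split_part(a_test, '_', 2), '')::int ASC NULLS FIRST;
```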

PostgreSQL two groups segregated but not ordered only by zero price column

I need help with a slightly crazy goal for a single query, and I'm not sure whether GROUP BY or a sub-SELECT applies.
The following query:
SELECT id_finish, description, inside_rate, outside_material, id_part, id_metal
FROM parts_finishing AS pf
LEFT JOIN parts_finishing_descriptions AS fd ON (pf.id_description=fd.id);
Returns the results like the following:
+-------------+-------------+------------------+--------------------------------+
| description | inside_rate | outside_material | id_part - id_finish - id_metal |
+-------------+-------------+------------------+--------------------------------+
| Nickle | 0 | 33.44 | 4444-44-44, 5555-55-55 |
+-------------+-------------+------------------+--------------------------------+
| Bend | 11.22 | 0 | 1111-11-11 |
+-------------+-------------+------------------+--------------------------------+
| Pack | 22.33 | 0 | 2222-22-22, 3333-33-33 |
+-------------+-------------+------------------+--------------------------------+
| Zinc | 0 | 44.55 | 6000-66-66 |
+-------------+-------------+------------------+--------------------------------+
I need the results to return in the fashion below but there are catches:
I need to group the rows by which of the two price columns is zero (inside_rate and outside_material are the prices) and ORDER BY the description column, but not sort them by price. A row belongs to one group if inside_rate is 0 and to the other group if outside_material is 0.
I need to ORDER BY the description column desc secondary after they are returned per group.
I need to return a list of parts (composed of three separate columns) for that inside/outside group / price for that finishing.
+-------------+-------------+------------------+--------------------------------+
| description | inside_rate | outside_material | id_part - id_finish - id_metal |
+-------------+-------------+------------------+--------------------------------+
| Bend | 11.22 | 0 | 1111-11-11 |
+-------------+-------------+------------------+--------------------------------+
| Pack | 22.33 | 0 | 2222-22-22, 3333-33-33 |
+-------------+-------------+------------------+--------------------------------+
| Nickle | 0 | 33.44 | 4444-44-44, 5555-55-55 |
+-------------+-------------+------------------+--------------------------------+
| Zinc | 0 | 44.55 | 6000-66-66 |
+-------------+-------------+------------------+--------------------------------+
The tables I'm working with and their data types:
Table "public.parts_finishing"
Column | Type | Modifiers
------------------+---------+-------------------------------------------------------------
id | bigint | not null default nextval('parts_finishing_id_seq'::regclass)
id_part | bigint |
id_finish | bigint |
id_metal | bigint |
id_description | bigint |
date | date |
inside_hours_k | numeric |
inside_rate | numeric |
outside_material | numeric |
sort | integer |
Indexes:
"parts_finishing_pkey" PRIMARY KEY, btree (id)
Table "public.parts_finishing_descriptions"
Column | Type | Modifiers
------------+---------+------------------------------------------------------------------
id | bigint | not null default nextval('parts_finishing_descriptions_id_seq'::regclass)
date | date |
description | text |
rate_hour | numeric |
type | text |
Indexes:
"parts_finishing_descriptions_pkey" PRIMARY KEY, btree (id)
The second table's first column is just id. (Why are we still dealing with a 1024 static width layout in 2015?)
I'd make an SQL fiddle though it refuses to load for me regardless of the browser.
Not entirely sure I understand your question. Might look like this:
SELECT fd.description, pf.inside_rate, pf.outside_material
     , concat_ws(' - ', pf.id_part::text
                      , pf.id_finish::text
                      , pf.id_metal::text) AS id_part_finish_metal
FROM   parts_finishing pf
LEFT   JOIN parts_finishing_descriptions fd ON pf.id_description = fd.id
ORDER  BY (pf.inside_rate = 0)           -- 1. sorts group "inside_rate" first
        , fd.description DESC NULLS LAST -- 2. possible NULL values last
;
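The desired output in the question shows one row per finishing with a comma-separated list of parts. A possible sketch of that variant, assuming rows should be merged when description and both prices are equal; the parts alias and the ascending description order (as in the sample output) are assumptions:

```sql
SELECT fd.description,
       pf.inside_rate,
       pf.outside_material,
       -- one "part-finish-metal" entry per row, joined with commas
       string_agg(concat_ws('-', pf.id_part, pf.id_finish, pf.id_metal),
                  ', ' ORDER BY pf.id_part) AS parts
FROM parts_finishing pf
LEFT JOIN parts_finishing_descriptions fd ON pf.id_description = fd.id
GROUP BY fd.description, pf.inside_rate, pf.outside_material
ORDER BY (pf.inside_rate = 0),  -- rows with a non-zero inside_rate first
         fd.description;
```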