I am brand new to PostgreSQL, coming from a few years at a company that used MySQL exclusively, and I am a little caught off guard by the TIMESTAMP type:
CREATE TABLE example (
example_id SERIAL PRIMARY KEY,
time_utc TIMESTAMP
);
INSERT INTO example (time_utc) VALUES (NOW());
SELECT * FROM example;
example_id | time_utc
------------+----------------------------
1 | 2017-11-02 21:37:26.592814
I am aware that I can simply cast the field to its less precise form:
SELECT example_id, time_utc::timestamp(0)
and am also aware that I can declare the precision in the table definition:
time_utc TIMESTAMP(0)
But is there a way to change this to be the default precision for all TIMESTAMP fields? And similarly, is there a way to change the behavior of NOW() to match this format (including its lack of timezone)?
For example in MySQL:
SELECT NOW();
+---------------------+
| NOW() |
+---------------------+
| 2017-11-02 21:41:26 |
+---------------------+
PostgreSQL:
SELECT NOW();
now
-------------------------------
2017-11-02 21:42:48.855801+00
Honestly, I can't think of any time in the past I have wished for more precision out of MySQL's timestamps, and the less precise form is objectively easier on the eyes. Is this an easy configuration change, or is this just something I need to accept in my transition to PostgreSQL?
Thanks
The date/time style can be selected by the user using the SET datestyle
command, the DateStyle parameter in the postgresql.conf
configuration file, or the PGDATESTYLE environment variable on the
server or client.
https://www.postgresql.org/docs/current/static/datatype-datetime.html#datatype-datetime-output
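For example, a minimal sketch (note that DateStyle changes the output style, not the fractional-second precision; the output shown is indicative):
SET datestyle = 'SQL, DMY';
SELECT now();
-- e.g. 02/11/2017 21:42:48.855801 UTC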
The available options (masks) are also listed at that reference, but they may not provide exactly what you want. Beyond that, you have the to_char() function:
The formatting function to_char (see Section 9.8) is also available as
a more flexible way to format date/time output.
https://www.postgresql.org/docs/current/static/functions-formatting.html
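For instance, a minimal sketch reproducing the MySQL-style output shown above:
SELECT to_char(now(), 'YYYY-MM-DD HH24:MI:SS');
-- 2017-11-02 21:41:26

-- or, to keep a timestamp value while dropping the time zone and fractional seconds:
SELECT now()::timestamp(0);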
Related
SELECT * FROM items WHERE created_time >= 20210505143012999
In MySQL, we can write a condition like WHERE created_time >= 20210505143012999,
but I want to filter with a similar format (20210505143012999) in PostgreSQL. How can I do this?
It seems MySQL is a little lax with data types (or perhaps just more forgiving); in Postgres your value is just a number (bigint). You need to convert it with the to_timestamp function, but the value is not an epoch, so Postgres cannot interpret it directly. You can either pre-convert it to a string and then use to_timestamp, or cast it to text within the function parameters. Either way, specify the format (see demo):
select to_timestamp ('20210505143012999', 'yyyymmddhh24missms');
select to_timestamp (20210505143012999::text, 'yyyymmddhh24missms');
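To answer the original question's WHERE clause, a sketch of the comparison itself, assuming a hypothetical items table with a bigint created_time column:
select *
from items
where to_timestamp(created_time::text, 'yyyymmddhh24missms')
      >= to_timestamp('20210505143012999', 'yyyymmddhh24missms');
-- note: wrapping created_time in a function prevents a plain index on that column from being used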
I'm working my way through the Datacamp SQL track, and I'm currently working with date values. I've encountered two examples which seem contradictory to me.
-- Count requests created on January 31, 2017
SELECT count(*)
FROM evanston311
WHERE date_created::date='2017-01-31';
And:
-- Count requests created on February 29, 2016
SELECT count(*)
FROM evanston311
WHERE date_created >= '2016-02-29'
AND date_created < '2016-03-01';
Why do I need to cast the value as date in the first case but not the other?
As with most typed languages, you can rely on implicit type casting... until you can't.
With something like date_created >= '2016-02-29', Postgres can use the type of date_created to figure out how to implicitly cast '2016-02-29'; there's no ambiguity. But sometimes Postgres can't make a guess at all.
OTOH a function like date_part has multiple signatures date_part(text, timestamp) and date_part(text, interval). If you pass it a date string...
test=# select date_part('day', '2019-01-03');
ERROR: function date_part(unknown, unknown) is not unique
LINE 1: select date_part('day', '2019-01-03');
^
HINT: Could not choose a best candidate function. You might need to add explicit type casts.
...Postgres cannot make a guess because the second string could be interpreted as either a timestamp or an interval type. You need to resolve this ambiguity.
# select date_part('day', '2019-01-03'::date);
date_part
-----------
3
Now that Postgres knows you're passing in a date, it can correctly resolve the call to the timestamp version.
Another reason is as a cheap way to truncate timestamps. In your example date_created::date = '2017-01-31' will truncate date_created to be a date and make the comparison work. Of course, date_created should already be a date...
You can use it on the value being compared if you're not sure if that value will be a date or a timestamp.
select * from table
where date_created = $1::date
This will work the same with '2019-01-02' or '2019-01-02 03:04:05'.
Which brings us to our final reason: making up for bad schemas, like when date_created is actually a timestamp or, all too common, text. In that case you need to explicitly control how comparisons are made. For example, let's say we had text_created of type text that contained timestamps as strings (naughty!), and maybe some poorly formatted data crept in with extra spaces on the end...
-- Text comparison compares the values exactly.
test=# select * from test where text_created = '2019-01-04';
 date_created | time_created | text_created
--------------+--------------+--------------
(0 rows)

-- Date comparison compares as dates, ignoring the extra whitespace.
test=# select * from test where text_created::date = '2019-01-04';
 date_created | time_created | text_created
--------------+--------------+--------------
              |              | 2019-01-04
(1 row)
See Chapter 10. Type Conversion in the Postgres docs for more.
So I have two external tables in Hive, in my Hadoop cluster.
One table has a (date STRING) column, with this format '2019-05-24 11:16:31.0'
and the other one has a (date STRING) column, with this format '23/May/2019:22:15:04'; they are both strings. I need to transform them to the same date format and use them to join these two tables.
How would you approach solving this entirely within Hive? Would it be possible? I'm quite the rookie in Hadoop, and I'm not fully aware of Hive's capabilities.
P.S. My Hive installation does not support the !hive --version command, so I'm not sure which version I'm working with. It's not my cluster and I'm not a root user.
You need to convert both strings to the same format before joining.
Converting the non-standard format '23/May/2019:22:15:04'
Use unix_timestamp(string date, string pattern) to convert the given date format to seconds elapsed since 1970-01-01, then use from_unixtime() to convert to the required format:
select from_unixtime(unix_timestamp('23/May/2019:22:15:04','dd/MMM/yyyy:HH:mm:ss'));
Returns:
2019-05-23 22:15:04
If you want the date only, specify the date format 'yyyy-MM-dd' in the from_unixtime function:
select from_unixtime(unix_timestamp('23/May/2019:22:15:04','dd/MMM/yyyy:HH:mm:ss'),'yyyy-MM-dd');
Returns:
2019-05-23
The second table contains the more standard format '2019-05-24 11:16:31.0', so you can use a simpler approach.
You can use a simple substr, because the date is already in the Hive format 'yyyy-MM-dd':
select substr('2019-05-24 11:16:31.0',1,10);
Returns:
2019-05-24
Or if you want the same format as in the first example 'yyyy-MM-dd HH:mm:ss':
select substr('2019-05-24 11:16:31.0',1,19);
Returns:
2019-05-24 11:16:31
The date_format function (as of Hive 1.2.0) can also be used for the same:
select date_format('2019-05-24 11:16:31.0','yyyy-MM-dd HH:mm:ss');
Returns:
2019-05-24 11:16:31
And the date portion only, using date_format (as of Hive 1.2.0):
select date_format('2019-05-24 11:16:31.0','yyyy-MM-dd');
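Putting it all together, a hedged sketch of the join itself, assuming hypothetical tables table1 and table2, each with a string column named date_str (rename to match your schema):
select t1.*, t2.*
from table1 t1
join table2 t2
  on substr(t1.date_str, 1, 19)
   = from_unixtime(unix_timestamp(t2.date_str, 'dd/MMM/yyyy:HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss');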
OK, you can use Hive's string functions and operators to bring the two different date formats into the same form, like below:
select regexp_replace(substring('2019-05-24 11:16:31.0',0,10),'-','') as date;
+-----------+
| date |
+-----------+
| 20190524 |
+-----------+
select concat(split(substring_index('23/May/2019:22:15:04',':',1),'/')[2],case when split(substring_index('23/May/2019:22:15:04',':',1),'/')[1]='May' then '05' end,split(substring_index('23/May/2019:22:15:04',':',1),'/')[0]) as date;
+-----------+
| date |
+-----------+
| 20190523 |
+-----------+
And then join them. Below is a simple example to clarify the usage; you can refine the details.
select
*
from
table1 t1
join
table2 t2
on regexp_replace(substring(t1.date,0,10),'-','') = concat(split(substring_index(t2.date,':',1),'/')[2],case when split(substring_index(t2.date,':',1),'/')[1]='May' then '05' end,split(substring_index(t2.date,':',1),'/')[0])
Does that make it clear?
I'm trying to construct a very simple graph showing how many visits I've had in some period of time (for example, for each 5 minutes).
I have Grafana v5.4.0 paired with Postgres v9.6, full of data.
My table below:
CREATE TABLE visit (
id serial CONSTRAINT visit_primary_key PRIMARY KEY,
user_credit_id INTEGER NOT NULL REFERENCES user_credit(id),
visit_date bigint NOT NULL,
visit_path varchar(128),
method varchar(8) NOT NULL DEFAULT 'GET'
);
Here's some data in it:
id | user_credit_id | visit_date | visit_path | method
----+----------------+---------------+---------------------------------------------+--------
1 | 1 | 1550094818029 | / | GET
2 | 1 | 1550094949537 | /mortgage/restapi/credit/{userId}/decrement | POST
3 | 1 | 1550094968651 | /mortgage/restapi/credit/{userId}/decrement | POST
4 | 1 | 1550094988557 | /mortgage/restapi/credit/{userId}/decrement | POST
5 | 1 | 1550094990820 | /index/UGiBGp0V | GET
6 | 1 | 1550094990929 | / | GET
7 | 2 | 1550095986310 | / | GET
...
So I tried these three variants (among dozens of others), with no success:
Solution A:
SELECT
visit_date as "time",
count(user_credit_id) AS "user_credit_id"
FROM visit
WHERE $__timeFilter(visit_date)
ORDER BY visit_date ASC
No data on graph. Error: pq: invalid input syntax for integer: "2019-02-14T13:16:50Z"
Solution B:
SELECT
$__unixEpochFrom(visit_date),
count(user_credit_id) AS "user_credit_id"
FROM visit
GROUP BY time
ORDER BY user_credit_id
Series A:
SELECT
$__time(visit_date/1000,10m,previous),
count(user_credit_id) AS "user_credit_id A"
FROM
visit
WHERE
visit_date >= $__unixEpochFrom()::bigint*1000 and
visit_date <= $__unixEpochTo()::bigint*1000
GROUP BY 1
ORDER BY 1
No data on graph. No error.
Solution C:
SELECT
$__timeGroup(visit_date, '1h'),
count(user_credit_id) AS "user_credit_id"
FROM visit
GROUP BY time
ORDER BY time
No data on graph. Error: pq: function pg_catalog.date_part(unknown, bigint) does not exist
Could someone please help me sort out this simple problem? I think the query should be compact and simple, but the Grafana docs demoing its syntax and features confuse me slightly. Thanks in advance!
Use this query, which will work if visit_date is timestamptz:
SELECT
$__timeGroupAlias(visit_date,5m,0),
count(*) AS "count"
FROM visit
WHERE
$__timeFilter(visit_date)
GROUP BY 1
ORDER BY 1
But your visit_date is bigint, so you need to convert it to a timestamp (probably with TO_TIMESTAMP()), or find another way to use it as bigint. Use the query inspector for debugging and you will see the SQL generated by Grafana.
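For example, a minimal conversion sketch, assuming visit_date stores epoch milliseconds:
SELECT to_timestamp(visit_date / 1000) AS visit_ts  -- integer division drops the millisecond remainder
FROM visit;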
Jan Garaj, thanks a lot! I should admit that your snippet, and even more valuably your additional comments advising me to switch to SQL debugging, dramatically helped me make my breakthrough.
So, the resulting query which solved my problem is below:
SELECT
$__unixEpochGroup(visit_date/1000, '5m') AS "time",
count(user_credit_id) AS "Total Visits"
FROM visit
WHERE
'1970-01-01 00:00:00 GMT'::timestamp + ((visit_date/1000)::text)::interval BETWEEN
$__timeFrom()::timestamp
AND
$__timeTo()::timestamp
GROUP BY 1
ORDER BY 1
Several comments to decipher all this Grafana magic:
Grafana has a limited DSL for making configurable graphs; this set of functions is converted into meaningful SQL (this is where seeing the "compiled" SQL helped me a lot, many thanks again).
To make my BIGINT column appropriate for the predefined Grafana functions, we simply need to convert it to seconds since the UNIX epoch; in math language, just divide by 1000.
Now, the WHERE clause is not so simple and predictable; the Grafana DSL works differently there, and simple division did not do the trick. I solved it by using other Grafana functions to get the FROM and TO points in time (the period for which the graph should be rendered), but these functions generate the timestamp type while our column is BIGINT. Thanks to Postgres, we have plenty of conversion tools: '1970-01-01 00:00:00 GMT'::timestamp + ((visit_date/1000)::text)::interval converts a BIGINT value to a Postgres TIMESTAMP, which Grafana handles just fine.
P.S. If you don't mind, I've changed my question text to be more precise and detailed.
I store a time as VARCHAR(30), for example '2016-08-12T15:15:01.100001Z'.
I know it is far from best practice. Is there some way to convert such a string into a PostgreSQL timestamp?
Simply casting can do the trick:
SELECT time::timestamptz FROM table;
Proof:
SELECT '2016-08-12T15:15:01.100001Z'::timestamptz;
timestamptz
-------------------------------
2016-08-12 15:15:01.100001+00
(1 row)
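If you want to fix the schema for good, a hedged sketch, assuming a hypothetical table t with a VARCHAR(30) column named created_at:
ALTER TABLE t
  ALTER COLUMN created_at TYPE timestamptz
  USING created_at::timestamptz;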