When do I need to cast a value as a date? - postgresql

I'm working my way through the Datacamp SQL track, and I'm currently working with date values. I've encountered two examples which seem contradictory to me.
-- Count requests created on January 31, 2017
SELECT count(*)
FROM evanston311
WHERE date_created::date='2017-01-31';
And:
-- Count requests created on February 29, 2016
SELECT count(*)
FROM evanston311
WHERE date_created>= '2016-02-29'
AND date_created< '2016-03-01';
Why do I need to cast the value as date in the first case but not the other?

As with most typed languages, you can rely on implicit type casting... until you can't.
With something like date_created >= '2016-02-29', Postgres can use the type of date_created to figure out how to implicitly cast '2016-02-29'. There's no ambiguity. But sometimes Postgres can't make a guess at all.
OTOH a function like date_part has multiple signatures: date_part(text, timestamp) and date_part(text, interval). If you pass it a date string...
test=# select date_part('day', '2019-01-03');
ERROR: function date_part(unknown, unknown) is not unique
LINE 1: select date_part('day', '2019-01-03');
^
HINT: Could not choose a best candidate function. You might need to add explicit type casts.
...Postgres cannot make a guess because the second string could be interpreted as either a timestamp or an interval type. You need to resolve this ambiguity.
test=# select date_part('day', '2019-01-03'::date);
date_part
-----------
3
Now that Postgres knows you're passing in a date it can correctly guess to use it as a timestamp.
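To make the two candidates concrete, here is a small sketch that calls each signature with an explicitly typed second argument. The literal values are made up for illustration; only date_part itself comes from the example above.

```sql
-- Typed literals leave no doubt about which date_part variant is meant.
SELECT date_part('day', timestamp '2019-01-03 10:00:00') AS day_of_month,  -- date_part(text, timestamp)
       date_part('day', interval '3 days 4 hours')       AS days_in_span;  -- date_part(text, interval)
```

Both columns come back as 3, but for different reasons: the first is the day of the month, the second is the days field of the interval.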
Another reason is as a cheap way to truncate timestamps. In your example date_created::date = '2017-01-31' will truncate date_created to be a date and make the comparison work. Of course, date_created should already be a date...
You can use it on the value being compared if you're not sure if that value will be a date or a timestamp.
select * from table
where date_created = $1::date
This will work the same with '2019-01-02' or '2019-01-02 03:04:05'.
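A quick way to convince yourself, sketched with plain literals standing in for the $1 parameter:

```sql
-- Both spellings collapse to the same date once cast, so the comparison
-- behaves identically whichever one the caller passes in.
SELECT '2019-01-02'::date          AS from_date_string,
       '2019-01-02 03:04:05'::date AS from_timestamp_string,
       '2019-01-02'::date = '2019-01-02 03:04:05'::date AS same_day;
```

The time part of the second literal is simply discarded by the cast.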
Which brings us to our final reason: making up for bad schemas, like when date_created is actually a timestamp or, all too commonly, text. In that case you need to explicitly control how comparisons are made. For example, let's say we had text_created of type text that contained timestamps as strings: naughty. And maybe some poorly formatted data crept in that has extra spaces on the end...
-- Text comparison compares the values exactly.
test=# select * from test where text_created = '2019-01-04';
date_created | time_created | text_created
--------------+--------------+--------------
-- Date comparison compares as dates ignoring the extra whitespace.
test=# select * from test where text_created::date = '2019-01-04';
date_created | time_created | text_created
--------------+--------------+--------------
| | 2019-01-04
See Chapter 10. Type Conversion in the Postgres docs for more.
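If you want to reproduce the whitespace example yourself, something like the following should do. The table definition is an assumption about what the test table might look like; only the column names are taken from the output above.

```sql
-- Hypothetical setup: column names follow the output above, types are guessed.
CREATE TEMP TABLE test (
    date_created date,
    time_created timestamp,
    text_created text
);

-- One row whose text value carries two trailing spaces.
INSERT INTO test (text_created) VALUES ('2019-01-04  ');

-- Text comparison: the trailing spaces make the strings differ, so no match.
SELECT count(*) AS text_matches FROM test WHERE text_created = '2019-01-04';

-- Date comparison: the cast parses the text as a date first, so it matches.
SELECT count(*) AS date_matches FROM test WHERE text_created::date = '2019-01-04';
```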

Related

How to manipulate column data in postgres

I need to manipulate column data in postgres
When I run the query-
SELECT t.date::varchar
FROM generate_series(timestamp '2020-02-27'
, timestamp '2020-03-01'
, interval '1 day') AS t(date);
it returns -
2020-02-27 00:00:00
2020-02-28 00:00:00
2020-02-29 00:00:00
2020-03-01 00:00:00
I want -
20200227
20200228
20200229
20200301
That is, with the '-' removed and the time part truncated from the end.
Can someone guide me?
If you don't specifically need some features of a varchar, by default use text instead.
You don't need to cast every time - generate_series() will do that automatically once it detects your step is an interval. That's unless you specifically want the generate_series(timestamp,timestamp,interval) variant, not generate_series(timestamptz,timestamptz,interval).
If you cast to be explicit, cast your dates as date. Regardless of whether you leave it as text literal or make them actual dates, PostgreSQL will have to cast them to match the function definition.
If you're planning to group things by a text-based date or do that to truncate timestamps, consider date_bin() and date_trunc() as well as simply holding things as a native date type. It'll take less space, run faster and enable native date-specific functions.
Make sure you're using to_char() to its full potential - it can save a lot of formatting later.
SELECT to_char(t.date,'YYYYMMDD') as date
FROM generate_series('2020-02-27'
, '2020-03-01'
, interval '1 day') AS t(date);
-- date
------------
-- 20200227
-- 20200228
-- 20200229
-- 20200301
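For the date_trunc() and date_bin() suggestion above, here is a rough sketch of what each one does to a single timestamp. The sample values are invented, and date_bin() needs PostgreSQL 14 or later.

```sql
SELECT
    -- Snap to the start of the containing month: 2020-02-01 00:00:00
    date_trunc('month', timestamp '2020-02-27 13:52:00') AS month_start,
    -- Snap to 15-minute bins counted from an arbitrary origin: 2020-02-27 13:45:00
    date_bin('15 minutes',
             timestamp '2020-02-27 13:52:00',
             timestamp '2020-01-01') AS quarter_hour;
```

Both return real timestamps, so you can group or compare on them directly and only call to_char() at the very end for display.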

Postgres Converting Data Types

I have a column saved as a character data type. This column is what I am going to be using as a date. The column goes "YYYY-MM-DD" in that format.
This is a problem because if I ever need to filter by date, I have to go
select col_1, col_2
from table
where date LIKE '2016-04%';
If I want to search for a date range, this turns into a giant complicated mess.
What is the easiest way to convert this to a "date" data type? I want it to continue to be in YYYY-MM-DD order (no timestamp).
My ultimate goal is to be able to search for dates in a format like this:
select col_1, col_2
from table
where date between 2016-01-01 AND 2016-05-31;
What do you guys recommend? I am terrified I am going to corrupt my date if I use an alter statement to convert my data type. (I have a copy of the data saved and can upload it again, but it will take forever.)
Edit: This is a VERY Large table.
Edit Part 2: I originally stored the data as a varchar data type because my dates were not uploading correctly and I got an error message when I tried to save as a date data type. Every date in this column is in the "YYYY-MM-DD" order. My solution was to save it as varchar to avoid the error message (I couldn't figure out what was wrong. I even got rid of leading and trailing spaces.)
Storing a date as a varchar was the wrong choice to begin with. It's very good that you want to change that.
The first step is to convert the columns using an ALTER TABLE statement:
alter table the_table
ALTER COLUMN col_1 TYPE date using col_1::date,
ALTER COLUMN col_2 TYPE date using col_2::date;
Note that this will fail if you have any value in those columns that cannot be converted to a correct date. If you get that error, you need to fix those invalid strings first before you can change the data type.
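One way to look for troublemakers before running the ALTER TABLE, as a sketch only. It reuses the the_table and col_1 names from the statement above. The first query is just a shape check, so it will not catch something like '2016-02-30'. The second relies on pg_input_is_valid(), which only exists in PostgreSQL 16 and later.

```sql
-- Shape check: anything that is not exactly NNNN-NN-NN.
SELECT col_1
FROM   the_table
WHERE  col_1 !~ '^\d{4}-\d{2}-\d{2}$';

-- PostgreSQL 16+ only: ask the date input routine itself whether the
-- string would be accepted, without raising an error on bad rows.
SELECT col_1
FROM   the_table
WHERE  NOT pg_input_is_valid(col_1, 'date');
```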
I want it to continue to be in YYYY-MM-DD order
This is a misconception. A DATE (or timestamp) does not have a "format". Once it's stored as a date you can display it in any format you want.
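To illustrate, the same date value can be rendered in several ways with to_char(). The sample date and the column aliases are made up for the example.

```sql
-- One stored value, three display formats chosen at query time.
SELECT to_char(d, 'YYYY-MM-DD') AS iso,
       to_char(d, 'DD.MM.YYYY') AS dotted,
       to_char(d, 'YYYYMMDD')   AS compact
FROM  (SELECT date '2016-04-15' AS d) AS sample;
```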
My ultimate goal is to be able to search for dates in a format like this:
2016-01-01 is not a valid date literal. A proper (i.e. correctly typed) date constant can be specified e.g. using date '2016-01-01' (note the single quotes!).
So your query becomes:
select col_1, col_2
from table
where col_1 between date '2016-01-01' AND date '2016-05-31';
If you have a lot of queries like that you should consider creating an index on the date columns.
Regarding the date constant format:
Are you telling me that despite having the varchar data types, I can still (as of right now) search between specific dates by just typing the word date and putting single quotes between two dates?
No, that's not the case. SQL is a strongly typed language and as such will only compare values of the same type.
Using an ANSI date literal (or e.g. to_date()) results in a typed constant (i.e. a value with a specific data type).
The difference between date '2016-01-01' and '2016-01-01' is the same as between 42 (a number) and '42' (a string).
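You can ask Postgres directly what type it sees for each of these, using pg_typeof(). A small sketch:

```sql
SELECT pg_typeof(date '2016-01-01') AS typed_date,     -- date
       pg_typeof('2016-01-01')      AS quoted_date,    -- unknown
       pg_typeof(42)                AS bare_number,    -- integer
       pg_typeof('42')              AS quoted_number;  -- unknown
```

Note that Postgres reports a bare quoted literal as unknown rather than text. It only gets a concrete type once the surrounding expression supplies one.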
If you compare a string with a date, you are comparing apples and oranges and the database will do an implicit data type conversion from one type to the other. This is something that should be avoided at all costs.
If you do not want to change the table, you should use the query sagi provided, which explicitly converts the strings to dates and then does the comparison on (real) date values (not strings).
You can use the Postgres to_date() function:
SELECT col_1,col_2
FROM Your_Table
WHERE to_date(date_col,'yyyy-mm-dd') between to_date('2016-01-01','yyyy-mm-dd') and to_date('2016-05-31','yyyy-mm-dd');
What @a_horse said.
Plus, if you can't change the data type for some odd reason, to_date() is a safe option to convert the column, but there is no point in using the same expression for the provided constants. So:
SELECT col_1, col_2
FROM tbl
WHERE to_date(date, 'YYYY-MM-DD') BETWEEN date '2016-01-01' AND date '2016-05-31';
Or just use string literals without type. The type date is derived from the context in this expression. And you don't even need to_date(), since you are using ISO format already. A plain cast is safe:
WHERE date::date BETWEEN '2016-01-01' AND '2016-05-31';
Be sure to use ISO 8601 format for all date strings, so they are unambiguous and valid with any locale.
You can even have an expression index to support the query. Match the actual expression used in queries:
CREATE INDEX tbl_date_idx ON tbl ((date::date)); -- parentheses required!
But I wouldn't use the basic type name date as identifier to begin with.

date_trunc on timestamp column returns nothing

I have a strange problem when retrieving records from db after comparing a truncated field with date_trunc().
This query doesn't return any data:
select id from my_db_log
where date_trunc('day',creation_date) >= to_date('2014-03-05'::text,'yyyy-mm-dd');
But if I add the column creation_date alongside id, then it returns data (i.e. select id, creation_date...).
I have another column, last_update_date, of the same type, and when I use that one I get the same behavior.
select id from my_db_log
where date_trunc('day',last_update_date) >= to_date('2014-03-05'::text,'yyyy-mm-dd');
As with the previous one, it also returns records if I put id, last_update_date in my select.
To dig further, I added both creation_date and last_update_date to my where clause, and this time it demands that both of them be in my select clause before it returns records (i.e. select id, creation_date, last_update_date).
Has anyone ever encountered the same problem? The same thing works with my other tables that have columns of this type!
If it helps, here is my table schema:
id serial NOT NULL,
creation_date timestamp without time zone NOT NULL DEFAULT now(),
last_update_date timestamp without time zone NOT NULL DEFAULT now(),
CONSTRAINT db_log_pkey PRIMARY KEY (id),
I asked a different question earlier that didn't get any answer. This problem may be related to that one. If you are interested in that one, here is the link.
EDITS:: EXPLAIN (FORMAT XML) with select * returns:
<explain xmlns="http://www.postgresql.org/2009/explain">
<Query>
<Plan>
<Node-Type>Result</Node-Type>
<Startup-Cost>0.00</Startup-Cost>
<Total-Cost>0.00</Total-Cost>
<Plan-Rows>1000</Plan-Rows>
<Plan-Width>658</Plan-Width>
<Plans>
<Plan>
<Node-Type>Result</Node-Type>
<Parent-Relationship>Outer</Parent-Relationship>
<Alias>my_db_log</Alias>
<Startup-Cost>0.00</Startup-Cost>
<Total-Cost>0.00</Total-Cost>
<Plan-Rows>1000</Plan-Rows>
<Plan-Width>658</Plan-Width>
<Node/s>datanode1</Node/s>
<Coordinator-quals>(date_trunc('day'::text, creation_date) >= to_date('2014-03-05'::text, 'yyyy-mm-dd'::text))</Coordinator-quals>
</Plan>
</Plans>
</Plan>
</Query>
</explain>
"Impossible" phenomenon
The number of rows returned is completely independent of items in the SELECT clause. (But see @Craig's comment about SRFs.) Something must be broken in your db.
Maybe a broken covering index? When you throw in the additional column, you force Postgres to visit the table itself. Try to re-index:
REINDEX TABLE my_db_log;
The manual on REINDEX. Or:
VACUUM FULL ANALYZE my_db_log;
Better query
Either way, use instead:
select id from my_db_log
where creation_date >= '2014-03-05'::date
Or:
select id from my_db_log
where creation_date >= '2014-03-05 00:00'::timestamp
'2014-03-05' is in ISO 8601 format. You can just cast this string literal to date. No need for to_date(), works with any locale. The date is coerced to timestamp [without time zone] automatically when compared to creation_date (being timestamp [without time zone]). More details about timestamps in Postgres here:
Ignoring timezones altogether in Rails and PostgreSQL
Also, you gain nothing by throwing in date_trunc() here. On the contrary, your query will be slower and no plain index on the column can be used (potentially making this much slower).
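If you ever need a single day rather than "from this day onwards", the same idea applies: compare the bare column against a half-open range instead of wrapping it in a function. A sketch, assuming a plain index on creation_date (the index name is made up):

```sql
-- Hypothetical index; any plain btree index on the column will do.
CREATE INDEX my_db_log_creation_date_idx ON my_db_log (creation_date);

-- Everything on 2014-03-05, with the column left untouched so the
-- planner is free to use the index above.
SELECT id
FROM   my_db_log
WHERE  creation_date >= '2014-03-05'
AND    creation_date <  '2014-03-06';
```

The upper bound is exclusive on purpose, so timestamps at exactly midnight of the next day are not counted twice across adjacent days.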

how to insert the current system date and time in oracle10g database

I have created a table with a column date_time of type varchar2(40), but when I try to insert the current system date and time it doesn't work; it gives an error (too many values). Please tell me what's wrong with the insert statement.
create table HR (type varchar2 (20), raised_by number (6), complaint varchar2 (500), date_time varchar2(40))
insert into HR values ('request',6785,'good morning',sysdate,'YYYY/MM/DD:HH:MI:SSAM')
The immediate cause of the error is that you have too many values, as the message says; that is, more elements in your values clause than there are columns. It is better to explicitly list the column names to avoid future problems and confusion, so you're really doing this:
insert into HR (type, raised_by, complaint, date_time)
values ('request',6785,'good morning',sysdate,'YYYY/MM/DD:HH:MI:SSAM')
... so you have four columns, but five values. You're trying to insert the current date/time as a string so you would need to use the to_char() function:
insert into HR (type, raised_by, complaint, date_time)
values ('request',6785,'good morning',
to_char(sysdate,'YYYY/MM/DD:HH:MI:SSAM'))
But it is bad practice to store a date (or any other structured data, such as a number) as a string. As the documentation notes:
Each value manipulated by Oracle Database has a data type. The data
type of a value associates a fixed set of properties with the value.
These properties cause Oracle to treat values of one data type
differently from values of another. For example, you can add values of
NUMBER data type, but not values of RAW data type.
If you use a string then you can put invalid values in. If you use a proper DATE data type then you cannot accidentally put an invalid or confusing value in. Oracle will also be able to optimise the use of the column, and will be able to compare values safely and efficiently. Although the format you're using is better than some, using string comparison you still can't easily compare two values to see which is earlier, so you can't properly order by the date_time column for example.
Say you inserted two rows with values 2013/11/15:09:00:00AM and 2013/11/15:08:00:00PM - which is earlier? You need to look at the AM/PM marker to realise the first one is earlier; with a string comparison you'd get it wrong because 8 would be sorted before 9. Using HH24 instead of HH and AM avoids that, but would still be less efficient than a true date.
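You can see the difference for yourself with a couple of throwaway queries. This is only a sketch in Oracle syntax, using the two sample values from above rather than a real table:

```sql
-- String ordering: the 08 PM value sorts first because '08' < '09'.
SELECT s
FROM (
  SELECT '2013/11/15:09:00:00AM' AS s FROM dual
  UNION ALL
  SELECT '2013/11/15:08:00:00PM' AS s FROM dual
)
ORDER BY s;

-- Date ordering: converting to a real DATE puts 09 AM first, as it should be.
SELECT s
FROM (
  SELECT '2013/11/15:09:00:00AM' AS s FROM dual
  UNION ALL
  SELECT '2013/11/15:08:00:00PM' AS s FROM dual
)
ORDER BY to_date(s, 'YYYY/MM/DD:HH:MI:SSAM');
```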
If you need to store a date with a time component you can use the DATE data type, which has precision down to the second; or if you need fractional seconds too then you can use TIMESTAMP. Then your table and insert would be:
create table HR (type varchar2 (20), raised_by number (6),
complaint varchar2 (500), date_time date);
insert into HR (type, raised_by, complaint, date_time)
values ('request',6785,'good morning',sysdate);
You can still get the value in the format you wanted for display purposes as part of a query:
select type, raised_by, complaint,
to_char(date_time, 'YYYY/MM/DD:HH:MI:SSAM') as date_time
from HR
order by HR.date_time;
TYPE                  RAISED_BY COMPLAINT            DATE_TIME
-------------------- ---------- -------------------- ---------------------
request                    6785 good morning         2013/11/15:08:44:35AM
Only treat a date as a string for display.
You can use the TO_DATE(), TO_TIMESTAMP() or TO_CHAR() functions. Since sysdate is already a date, systimestamp is already a timestamp, and your table definition has a varchar type for the date_time column, TO_CHAR() is the one you need here:
insert into HR values ('request',6785,'good morning',TO_CHAR(sysdate, 'yyyy/mm/dd hh24:mi:ss'))
insert into HR values ('request',6785,'good morning',TO_CHAR(systimestamp, 'yyyy/mm/dd hh24:mi:ss'))
sysdate - Gives the current date with time.
systimestamp - Gives the current date and time with fractional seconds.
TO_DATE() - Used to convert a string to a date.
TO_CHAR() - Used to convert a date to a string.
Use TIMESTAMP datatype for date_time. And while inserting use the current timestamp.
create table HR (type varchar2(20), raised_by number(6), complaint varchar2(500), date_time timestamp);
insert into HR values ('request',6785,'good morning', systimestamp);
For other options: http://psoug.org/reference/timestamp.html

PostgreSQL amount for each day summed up in weeks

I've been trying to find a solution to this challenge all day.
I've got a table:
  id   | amount  | type |            date            |              description              | club_id
-------+---------+------+----------------------------+---------------------------------------+---------
   783 |   10000 |    5 | 2011-08-23 12:52:19.995249 | Sign on fee                           |       7
The table has a lot more data than this.
What I'm trying to do is get the sum of amount for each week, given a specific club_id.
The last thing I ended up with was this, but it doesn't work:
WITH RECURSIVE t AS (
SELECT EXTRACT(WEEK FROM date) AS week, amount FROM club_expenses WHERE club_id = 20 AND EXTRACT(WEEK FROM date) < 10 ORDER BY week
UNION ALL
SELECT week+1, amount FROM t WHERE week < 3
)
SELECT week, amount FROM t;
I'm not sure why it doesn't work, but it complains about the UNION ALL.
I'll be off to bed in a minute, so I won't be able to see any answers before tomorrow (sorry).
I hope I've described it adequately.
Thanks in advance!
It looks to me like you are trying to use the UNION ALL to retrieve a subset of the first part of the query. That won't work. You have two options. The first is to use user-defined functions to add behavior as you need it, and the second is to nest your WITH clauses. I tend to prefer the former, but you may prefer the latter.
To do the functions/table methods approach you create a function which accepts as input a row from a table and does not hit the table directly. This provides a bunch of benefits including the ability to easily index the output. Here the function would look like:
CREATE FUNCTION week(club_expenses) RETURNS int LANGUAGE SQL IMMUTABLE AS $$
select EXTRACT(WEEK FROM $1.date)::int
$$;
Now you have a usable macro which can be used where you would use a column. You can then:
SELECT c.week, sum(amount) FROM club_expenses c
GROUP BY c.week;
Note that the c. before week is not optional. The parser converts that into week(c). If you want to limit this to a year, you can do the same with years.
This is a really neat, useful feature of Postgres.
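If you would rather not create a function, a different and more common approach is to group on date_trunc('week', ...) directly. This is a sketch of that alternative, not the method described above. It reuses the club_expenses table and the club_id = 20 filter from the question.

```sql
-- One row per calendar week (weeks start on Monday), for a single club.
SELECT date_trunc('week', date) AS week_start,
       sum(amount)              AS total_amount
FROM   club_expenses
WHERE  club_id = 20
GROUP  BY 1
ORDER  BY 1;
```

Unlike EXTRACT(WEEK FROM date), this keeps the same week number from different years apart, because week_start is a full timestamp rather than a number from 1 to 53.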