Converting inconsistent strings to date - postgresql

I'm working with a data set that has a date column stored as a string. I thought it would be simple to convert by doing something like this:
to_date(date_of_birth,'YYYY-01-01')
date_of_birth is in the format DD/MM/YYYY, and is of type 'text'
HOWEVER, I stumbled upon some crazy cases where you have information like
//1980
Another case was:
0/0/1980
When I run my solution, I receive the following error:
ERROR: invalid value "//19" for "YYYY"
DETAIL: Value must be an integer.
My goal actually is simply to collect the year, since that's at least consistent. How do you handle such cases with Postgres?
EDIT:
Switched it to the following:
to_date(date_of_birth,'01/01/YYYY')
My query is this:
SELECT to_date(date_of_birth,'01/01/YYYY') AS year, COUNT(*) AS yearTotal FROM student WHERE date_of_birth LIKE '%/%/1980' GROUP BY year;
The result turns out like this:
year | yeartotal
---------------+-----------
0030-01-01 | 3
0001-01-01 BC | 1

I solved the problem by reducing the date information (which was of type string) to 'YYYY' so that I could access the proper information through date_part.
Using PostgreSQL's "right" function, I was able to collect the year from the string written in the format "DD/MM/YYYY", where right('10/12/2013',4) will return '2013'.
My query looked like this:
SELECT date_part('year', to_date(right(date_of_birth,4),'YYYY')) AS year, COUNT(*) AS total FROM student GROUP BY year
Another way I could've solved the problem was through regex by ensuring that I only work with 'valid' date statements. Something along the line of (using Python regex as an example):
regexp_matches(date_of_birth, '.\/.\/\d{4}')
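Since the regex idea above is described in Python terms, here is a minimal Python sketch of it, run outside the database. The pattern and the helper name extract_year are my own assumptions, not part of the original answer. It accepts an empty or one-to-two digit day and month, so rows like //1980 and 0/0/1980 still yield a year, while anything without a four-digit year is rejected.

```python
import re

# Optional day and month (0 to 2 digits each), then a mandatory four-digit year.
# Covers "10/12/2013", "0/0/1980" and "//1980".
YEAR_PATTERN = re.compile(r"^\d{0,2}/\d{0,2}/(\d{4})$")

def extract_year(raw):
    """Return the year as an int, or None when the string has no usable year."""
    match = YEAR_PATTERN.match(raw.strip())
    return int(match.group(1)) if match else None

samples = ["10/12/2013", "//1980", "0/0/1980", "unknown"]
print([extract_year(s) for s in samples])  # [2013, 1980, 1980, None]
```

Returning None rather than raising keeps the bad rows visible, so they can be counted or reviewed separately instead of aborting the whole query.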

try TO_CHAR instead of TO_DATE, it worked for me
SELECT replace(date_of_birth,'//','') AS year, COUNT(*) AS yearTotal FROM student GROUP BY year;

postgreSQL increment number in output

I am extracting three values (server, region, max(date)) from my PostgreSQL database, but I want to extract an additional fourth field, which should be the third field plus one day. I am unable to use a date add function because the date field is defined as an integer in the database.
date type in DB
date|integer|not null
tried using cast and date add function
MAX(s.date)::date + cast('1 day' as interval)
Error Received
ERROR: cannot cast type integer to date
Required output
select server, region, max(alarm_date), next date from table .....
testserver, europe, 20190901, 20190902
testserver2, europe, 20191001, 20191002
next date value should be the addition to alarm_date
To convert an integer like 20190901 to a date, use something like
to_date(CAST(s.date AS text), 'YYYYMMDD')
It is a bad idea to store dates as integers like that. Using the date data type will prevent corrupted data from entering the database, and it will make all operations natural.
First solution that came to my mind:
select (20190901::varchar)::date + 1
which outputs 2019-09-02 as type date.
Other solutions can be found here.
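To see why the integer has to go through a real date type before adding a day, here is a small Python sketch of the same round trip (integer to date, add one day, back to integer). The function name next_day is hypothetical. Simply adding 1 to the integer would turn 20190930 into 20190931, which is not a date.

```python
from datetime import datetime, timedelta

def next_day(yyyymmdd):
    """Take an integer such as 20190901 and return the following day in the same form."""
    current = datetime.strptime(str(yyyymmdd), "%Y%m%d").date()
    following = current + timedelta(days=1)
    return int(following.strftime("%Y%m%d"))

print(next_day(20190901))  # 20190902
print(next_day(20190930))  # 20191001, not 20190931
```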

Cast varchar as date select distinct top 100

I am trying to fix a query whose problem came to light in SSRS after the new year. We have an input that comes from another application. It grabs a date and stores it as varchar. The SSRS report then fetches the top 100 'dates', but now that 2017 dates have come around, these are not in the top 100.
The existing query is as follows
SELECT DISTINCT TOP (100) Date
FROM DenverTempData
ORDER BY Date DESC
The date is stored as VARCHAR, so this query does not rank a value such as 01012017 in the top 100 above values like 12312016. I thought maybe I could simply change the data type of this column to datetime, but the information comes from a flat file and is converted, so it's a little more difficult than that. So I'm hoping to select the distinct top 100 while converting the date column to datetime or just date, grabbing the last 100 dates.
Can someone help with the query syntax? I'm thinking a cast to convert varchar to date, but how do I format with distinct top 100? I'm simply looking to retrieve the last 100 dates in chronological order from a column that is stored as varchar but contains a string representing a date.
Hopefully that makes sense
It is always a bad idea to store a date as string. This is highly culture specific!
You can cast your formatted string-date to a real date like this:
DECLARE @DateMMDDYYYY VARCHAR(100)='12312016';
SELECT CONVERT(DATE,STUFF(STUFF(@DateMMDDYYYY,5,0,'-'),3,0,'-'),110)
After the conversion your sorting (and therefore the TOP 100) should work as expected.
My strong advice: try to store your dates in a column of a real date type to avoid such hassle!
SELECT DISTINCT TOP 100 CAST(VarcharColumn AS date) AS DateColumn
FROM TABLE
ORDER BY DateColumn DESC
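The underlying issue is that MMDDYYYY strings sort by their leading month digits, not chronologically. A minimal Python sketch, using made-up sample values and a hypothetical helper latest_dates, shows the difference between sorting the text and sorting the parsed dates:

```python
from datetime import datetime

def parse_mmddyyyy(value):
    """Parse an MMDDYYYY string such as '12312016' into a date."""
    return datetime.strptime(value, "%m%d%Y").date()

def latest_dates(values, limit=100):
    """Distinct dates, newest first, capped at `limit` (like DISTINCT TOP n ... DESC)."""
    return sorted({parse_mmddyyyy(v) for v in values}, reverse=True)[:limit]

raw = ["12312016", "01012017", "11152016", "01012017"]

# Text ordering puts December 2016 ahead of January 2017.
print(sorted(set(raw), reverse=True))  # ['12312016', '11152016', '01012017']

# Date ordering is chronological.
print([d.isoformat() for d in latest_dates(raw)])  # ['2017-01-01', '2016-12-31', '2016-11-15']
```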

Access: grouping by month/year, how then to display the date properly

I have a query where I want to group by month and year, hence:
GROUP BY ... Month(Deductions.USDate), Year(Deductions.USDate) ...
But because of the GROUP BY, I need to use these things in the main select query, hence:
SELECT Month(Deductions.USDate)&"/"&Year(Deductions.USDate)
which gives me 8/2012, 11/2011 etc. Which is great, but it's a string, not a date, so I can't query against it properly.
I've tried doing something like the below, but am getting a type mismatch. What is wrong?
CDATE("1/"&Month(Deductions.USDate)&"/"&Year(Deductions.USDate)
I think you are looking for this:
DateSerial(Year(Deductions.USDate), Month(Deductions.USDate), 1)
This returns a date given a year and month, and the day is always set to 1.

Conversion failed when converting date and/or time from character string Error

Select CONVERT(Date, '13-5-2012')
When I run the above T-SQL statement in Management Studio, I get the following error:
"Conversion failed when converting date and/or time from character string"
Is there a way I can cast that value to a valid Date type successfully? I have such values in an nvarchar(255) column in a SQL Server table whose data type I want to change to Date, but I have hit that error, so I would like to first do a conversion in an UPDATE statement on the table.
Specify what date format you are using:
Select CONVERT(Date, '13-5-2012', 105)
105 means Italian date format with century (dd-mm-yyyy).
Ref: http://msdn.microsoft.com/en-us/library/ms187928.aspx
In general, I'd suspect there is data in the column that can't be converted, and I would use a CASE expression to check that it's convertible first:
SELECT CASE WHEN ISDATE(mycolumn)=1 THEN CONVERT(Date, mycolumn, [style]) END
FROM mytable
I believe Convert relies on the SQL Server date format setting. Please check your dateformat setting with DBCC USEROPTIONS.
I suspect if you set the dateformat to dmy it'll understand:
SET DATEFORMAT dmy
GO
If it still doesn't work and you can't find a style that matches your data, but your data is in a consistent format, it's down to manual string manipulation to build the date (don't do this if you can help it).
Try this....
Select CONVERT(Date,'5-13-2012')
Use 'mm-dd-yyyy' format.
CONVERT assumes that the original data can represent a date. One bad data item can throw the same conversion error mentioned here without pointing to the problem.
Using ISDATE helped me get around the bad data items.
SELECT CONVERT(DATE, CONVERT(CHAR(8), FieldName))
FROM DBName
WHERE ISDATE(FieldName) <> 0
You need to specify the date format when converting; this will resolve the error.
select convert(date, '13-5-2012' ,103)
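The two ideas in this thread (name the format explicitly, and guard against rows that cannot be converted) can be illustrated together. The following is only a Python sketch of that logic, not T-SQL. The helper name try_parse_dmy is an assumption, and the format string "%d-%m-%Y" plays the role of style 105.

```python
from datetime import datetime

def try_parse_dmy(value):
    """Parse a dd-mm-yyyy string; return None instead of failing on bad input."""
    try:
        return datetime.strptime(value, "%d-%m-%Y").date()
    except (ValueError, TypeError):
        # ValueError: text that does not fit the format; TypeError: a NULL-like None.
        return None

print(try_parse_dmy("13-5-2012"))  # 2012-05-13
print(try_parse_dmy("5-13-2012"))  # None, because there is no month 13
print(try_parse_dmy("abc"))        # None
```

Collecting the None results first is a cheap way to find the bad rows before changing the column type.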

How to convert d/MM/yyyy data to dd/MM/yyyy in sql server table?

I created a field in a SQL Server database with the nvarchar data type and previously stored dates in it in both 'd/MM/yyyy' and 'dd/MM/yyyy' formats. Now I want to get all the data in 'dd/MM/yyyy' format using a query. Is that possible?
You can cast the field to datetime in the query:
select cast(YourField as datetime)
from YourTable
where isdate(YourField) = 1
The where isdate(YourField) = 1 part is necessary to filter out rows where the value is not a valid date (it's an nvarchar field, so there could be things like abc in some rows!)
But you should really change the field to datetime in the long term, as already suggested by Christopher in his comment.
Casting like described above is always error-prone because of the many different data formats in different countries.
For example, I live in Germany where the official date format is dd.mm.yyyy.
So today (December 9th) is 9.12.2011, and running select cast('9.12.2011' as datetime) on my machine returns the correct datetime value.
Another common format is mm/dd/yyyy, so December 9th would be 12/9/2011.
Now imagine I have a nvarchar field with a date in this format on my German machine:
select cast('12/9/2011' as datetime) will return September 12th (instead of December 9th)!
Issues like this can easily be avoided by using the proper type for the column, in this case datetime.
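The ambiguity described above is easy to reproduce outside SQL Server. This Python sketch parses the same text under both conventions and also shows the zero-padding the question asked for. The helper name normalize_dmy is hypothetical, and it assumes the stored values are day-first.

```python
from datetime import datetime

text = "12/9/2011"

# The same string yields two different dates depending on the assumed order.
month_first = datetime.strptime(text, "%m/%d/%Y").date()
day_first = datetime.strptime(text, "%d/%m/%Y").date()
print(month_first)  # 2011-12-09
print(day_first)    # 2011-09-12

def normalize_dmy(value):
    """Re-emit a d/MM/yyyy or dd/MM/yyyy string as zero-padded dd/MM/yyyy."""
    return datetime.strptime(value, "%d/%m/%Y").strftime("%d/%m/%Y")

print(normalize_dmy("9/12/2011"))  # 09/12/2011
```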