How to convert a string date from 'yyyy-m-dd' to 'yyyy-mm-dd' in a Hive query?

I searched up and down but couldn't find anything that works.
I have a date that is stored as a string in this format: '2021-9-01', so there is no leading zero in the month part. This is an issue when trying to select a max date, because string comparison treats September as greater than October.
Any time I run something that tries to convert this, it never finishes. I can pull back 1 row when selecting * from... but this fails to complete:
select unix_timestamp(bad_date, 'yyyy-m-dd') from mytable
I'm using a Hive query, so I'm not sure how to make this conversion work so that October (this month) actually shows up as the max date.

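The correct pattern letter for month is uppercase M: MM is a zero-padded month, and a single M also parses a month without a leading zero. Lowercase mm means minutes.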
from_unixtime(unix_timestamp(bad_date, 'yyyy-M-dd'),'yyyy-MM-dd')
One more method is to split and concatenate with lpad:
select concat_ws('-',splitted[0], lpad(splitted[1],2,'0'),splitted[2])
from
(
select split('2021-9-01','-') splitted
)s
Result:
2021-09-01
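To see why the unpadded strings sort incorrectly and why padding fixes it, here is a minimal Python sketch of the same split-and-pad idea. It is not Hive code; the helper name pad_date and the sample values are made up for illustration.

```python
def pad_date(s: str) -> str:
    """Zero-pad the month and day of a 'yyyy-m-d' string to 'yyyy-mm-dd'."""
    year, month, day = s.split("-")
    return f"{year}-{month.zfill(2)}-{day.zfill(2)}"


# Hypothetical sample values, one per month.
dates = ["2021-9-01", "2021-10-15"]

# Plain string comparison looks at '9' vs '1' and picks September.
print(max(dates))

# After padding, lexicographic order matches chronological order.
print(max(pad_date(d) for d in dates))
```

The first print shows the September value winning, the second shows October, which is the behaviour the Hive expressions above are meant to produce.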

Related

Can someone help me convert this sqlite query to a postgres query?

SELECT substr(strftime('%Y', date),3,2) AS month, 0 AS zero FROM listvalue
I have this query in SQLite. When I import it into Postgres, I'm having a problem translating the substr(strftime('%Y', date),3,2) part.
substr(strftime('%Y', date),3,2) extracts the last 2 digits of the year part of the column date, but in your code you alias it as month!
If you want to do the same in PostgreSQL, you can use extract() to get the year, typecast it to varchar and use substr() to get the last 2 chars:
substr(extract(year from date)::varchar, 3, 2)
You can translate the expression to PostgreSQL with the TO_CHAR function:
SUBSTR(TO_CHAR(date, 'YYYY'), 3, 2)
Note: The format %Y will return the year of the date and not the month as your query suggests.
The SQLite query will return the last two digits of the year of the date. You can omit the SUBSTR in the PostgreSQL call by using the appropriate format:
TO_CHAR(date, 'YY')
We suspect that you want to extract the last two digits of the year value from the date column. The strftime() function does not exist in PostgreSQL, so we suggest using either of the queries below:
SELECT substring(date_part('year', date)::varchar, 3, 2) AS year, 0 AS zero FROM listvalue;
SELECT to_char(date, 'YY') AS year, 0 AS zero FROM listvalue;
Thanks,
Renuka N.
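For comparison, the two approaches above (a substring of the four-digit year versus a two-digit year format) can be sketched in Python. This only illustrates the idea and is not PostgreSQL code; the sample date is an assumption.

```python
from datetime import date

d = date(2021, 9, 1)  # hypothetical sample value

# Same idea as SUBSTR(TO_CHAR(date, 'YYYY'), 3, 2): SQL positions are
# 1-based, so position 3 with length 2 is the Python slice [2:4].
via_substr = str(d.year)[2:4]

# Same idea as TO_CHAR(date, 'YY'): ask for the two-digit year directly.
via_format = d.strftime("%y")

print(via_substr, via_format)
```

Both values are the last two digits of the year, not the month.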

Extract highest date per month from a list of dates

I have a date column which I am trying to query to return only the largest date per month.
What I currently have, albeit very simple, returns 99% of what I am looking for. For example, If I list the column in ascending order the first entry is 2016-10-17 and ranges up to 2017-10-06.
A point to note is that the last day of every month may not be present in the data, so I'm really just looking to pull back whatever is the "largest" date present for any existing month.
The query I'm running at the moment looks like
SELECT MAX(date_col)
FROM schema_name.table_name
WHERE <condition1>
AND <condition2>
GROUP BY EXTRACT (MONTH FROM date_col)
ORDER BY max;
This does actually return most of what I'm looking for - what I'm actually getting back is
"2016-11-30"
"2016-12-30"
"2017-01-31"
"2017-02-28"
"2017-03-31"
"2017-04-28"
"2017-05-31"
"2017-06-30"
"2017-07-31"
"2017-08-31"
"2017-09-29"
"2017-10-06"
which are indeed the maximal values present for every month in the column. However, the result set doesn't include the maximum date value from October 2016 (the first month's worth of data in the column). There are multiple values in the column for that month, ranging up to 2016-10-31.
If anyone could point out why the max value for this month isn't being returned, I'd much appreciate it.
You are grouping by month (1 to 12) rather than by month and year. Since 2017-10-06 is greater than any day in October 2016, that's what you get for the "October" group.
You should group by the date truncated to the month, which keeps the year:
GROUP BY date_trunc('month', date_col)
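The difference between grouping by month number and grouping by year plus month can be sketched in Python as follows. The function name max_per_month and the sample dates are assumptions for illustration; the (year, month) key plays the role of date_trunc('month', date_col).

```python
from datetime import date


def max_per_month(dates):
    """Return the latest date for each (year, month), in chronological order."""
    latest = {}
    for d in dates:
        key = (d.year, d.month)  # year AND month, not month alone
        if key not in latest or d > latest[key]:
            latest[key] = d
    return [latest[k] for k in sorted(latest)]


# Hypothetical sample covering October in two different years.
sample = [
    date(2016, 10, 17),
    date(2016, 10, 31),
    date(2016, 11, 30),
    date(2017, 10, 6),
]

for d in max_per_month(sample):
    print(d.isoformat())
```

With the key reduced to d.month alone, 2016-10-31 and 2017-10-06 would share one group and only the 2017 date would survive, which is the behaviour described in the question.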

how to format date column in kdb

I am a newbie to KDB. I have a KDB table which I am querying as:
select[100] from table_name
This table has some date columns, which store dates in this format:
yyyy.mm.dd
I wish to query that table and retrieve the date fields in a specific format (like mm/dd/yyyy). If this had been any other RDBMS table, this is what I would have done:
select to_date(date_field,'mm/dd/yyyy') from table_name
I need the kdb equivalent of the above. I've tried my best to go through the kdb docs but was unable to find any function / example / syntax to do that.
Thanks in advance!
As Anton said KDB doesn't have an inbuilt way to specify the date format. However you can extract the components of the date individually and rearrange as you wish.
For the example table t with date column:
q)t
date
----------
2008.02.04
2015.01.02
q)update o:{"0"^"/"sv'flip -2 -2 4$'string`mm`dd`year$\:x}date from t
date o
-----------------------
2008.02.04 "02/04/2008"
2015.01.02 "01/02/2015"
From right to left inside the function: we extract the month, day and year components with `mm`dd`year$\:x and string the result. We then left-pad the month and day components to width 2 with spaces, which are the char null in q (-2 -2 4$'), and join the parts with "/" ("/"sv'flip). Finally the leading spaces are filled with "0" ("0"^).
Check out this GitHub library for datetime formatting. It supports the Excel way of formatting date and time. It might not be the right fit for formatting a very large number of objects (but if there are only a few distinct dates, a keyed table and lj can be used for lookup).
q).dtf.format["mm/dd/yyyy"; 2016.09.23]
"09/23/2016"
q).dtf.format["dd mmmm yyyy"; 2016.09.03] // another example
"03 September 2016"
I don't think KDB has built-in date formatting features.
The most reliable way is to format date by yourself.
For example
t: ([]date: 10?.z.d);
update dateFormatted: {x: "." vs x; x[1],"/",x[2],"/",x[0]} each string date from t
gives
date dateFormatted
------------------------
2012.07.21 "07/21/2012"
2001.05.11 "05/11/2001"
2008.04.25 "04/25/2008"
....
Or, a more efficient way to do the same formatting is
update dateFormatted: "/"sv/:("."vs/:string date)[;1 2 0] from t
The qdate library is now also available for datetime parsing and conversion.
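For readers less familiar with q, the split-and-reorder approach used in the answers above can be sketched in Python. This is only an illustration of the logic, not kdb code, and the function name reformat_kdb_date is made up.

```python
def reformat_kdb_date(s: str) -> str:
    """Turn a 'yyyy.mm.dd' string into 'mm/dd/yyyy'."""
    year, month, day = s.split(".")        # like "." vs x
    return "/".join((month, day, year))    # like "/" sv x[1 2 0]


for raw in ("2008.02.04", "2015.01.02"):
    print(raw, reformat_kdb_date(raw))
```

The sample values are the two dates from the example table t shown earlier.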

How to compare dates or date against today with query on google sheets?

I'm working on making a replica of sheet1 on another sheet2 (same document), and query() worked fine until the column I want to filter on contained formula cells (long ones, each with query, match, etc.).
What I want to do is filter the rows in sheet1 where the event date in column M is upcoming (there are more filter conditions, but to simplify, this is the main problem).
I don't want the rows where the date is empty, in the past (in various date formats), or where the formula gives an empty string "".
The formulas I've tried (which give errors) - note I'm just selecting 2 columns for testing:
=query(sheet1!A3:N, " select I,M where I = 'Singapore' AND DATEVALUE(M)>TODAY() ",0)
=query(sheet1!A3:N, " select I,M where I = 'Singapore' AND M>TODAY() ",0)
This formula doesn't give an error but doesn't show the correct data - it shows all data except Jan 2017 - August 7 2017:
=FILTER(sheet1!A3:N, sheet1!I3:I="Singapore", sheet1!M3:M>TODAY())
This formula gives empty output:
=query(sheet1!A3:N, " select I,M where I = 'Singapore' AND M='22 August' ",0)
There's no today() in Query. Use now() instead:
=query(sheet1!A3:N, " select I,M where I = 'Singapore' AND M > now() ",0)
Or, if you want now() without the time (equivalent to TODAY()), use:
todate(now())
For this to work, all the dates in M must be values that Google Sheets recognises as dates (i.e., the formula bar shows the correct date or time), regardless of how they are displayed. If not, you should manually convert the remaining dates to a recognised format. Inside the query string, a date literal must be written as date 'yyyy-mm-dd'. It doesn't matter what format the date has in the sheet (as long as Google Sheets recognises it), but the query itself must use this format for the comparison. For example, to find all dates before 31 August 2017:
=query(A2:B6, "select A where A < date '2017-08-31'")
You can use this in a helper column to find the dates that Google Sheets doesn't recognise. In N1:
M:M*1
If you get an error in the helper column N, then those dates are not recognised. Even if you do not get an error, it is possible that Google Sheets misrecognises the date. There is also a more specific function:
=ARRAYFORMULA(ISDATE(M:M))
References:
Scalar functions
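The filtering rule itself (keep only Singapore rows with a real date that lies in the future, and skip empty strings) can be sketched outside Sheets in Python. The function name upcoming, the sample rows and the fixed value for today are all assumptions for illustration; this is not a Sheets formula.

```python
from datetime import date


def upcoming(rows, today, city="Singapore"):
    """Keep (city, event_date) rows for `city` whose date is after `today`.

    Cells that are not real dates (for example the empty string "" that a
    formula may return) are skipped instead of being compared.
    """
    return [
        (c, d)
        for c, d in rows
        if c == city and isinstance(d, date) and d > today
    ]


# Hypothetical rows standing in for columns I and M.
rows = [
    ("Singapore", date(2017, 8, 22)),
    ("Singapore", ""),                  # formula returned an empty string
    ("Singapore", date(2017, 1, 5)),    # already in the past
    ("Tokyo", date(2017, 9, 1)),
]

print(upcoming(rows, today=date(2017, 8, 8)))
```

The isinstance check mirrors the point above: the comparison is only meaningful for cells that hold an actual date value rather than text.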

Cast varchar as date select distinct top 100

I am trying to fix a query problem that has come to light in SSRS after the new year. We have an input that comes from another application. It grabs a date and stores it as varchar. The SSRS report then fetches the top 100 'dates', but now that 2017 dates have come around, these are not in the top 100.
The existing query is as follows
SELECT DISTINCT TOP (100) Date
FROM DenverTempData
ORDER BY Date DESC
The date is stored as VARCHAR. So obviously this query doesn't rank a value such as 01012017 in the top 100 (over values like 12312016). I thought maybe I could simply change the datatype of this column to datetime. But the information comes from a flat file and is converted, so it's a little more difficult than that. So I'm hoping to select the distinct top 100 while converting the date column to datetime or just date, grabbing the last 100 dates.
Can someone help with the query syntax? I'm thinking a cast to convert varchar to date, but how do I format with distinct top 100? I'm simply looking to retrieve the last 100 dates in chronological order from a column that is stored as varchar but contains a string representing a date.
Hopefully that makes sense
It is always a bad idea to store a date as a string. String formats are highly culture specific!
You can convert your formatted string date to a real date like this:
DECLARE @DateMMDDYYYY VARCHAR(100)='12312016';
SELECT CONVERT(DATE,STUFF(STUFF(@DateMMDDYYYY,5,0,'-'),3,0,'-'),110)
After the conversion your sorting (and therefore the TOP 100) should work as expected.
My strong advice: try to store your dates in a column of a real date type to avoid such hassle!
SELECT DISTINCT TOP 100 CAST(VarcharColumn AS date) AS DateColumn
FROM TABLE
ORDER BY DateColumn DESC
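The overall logic (parse the MMDDYYYY strings into real dates, drop duplicates, sort newest first, keep the top N) can be sketched in Python. This is an illustration under the assumption that every value really is in MMDDYYYY form; the function name latest_dates and the sample values are made up.

```python
from datetime import datetime


def latest_dates(raw_values, limit=100):
    """Parse MMDDYYYY strings, de-duplicate, and return the newest `limit` dates."""
    parsed = {datetime.strptime(v, "%m%d%Y").date() for v in raw_values}
    return sorted(parsed, reverse=True)[:limit]


# Hypothetical sample values, including one duplicate.
raw = ["12312016", "01012017", "12312016", "01022017"]

# Sorting the raw strings puts the 2016 value first, which is the reported bug.
print(sorted(set(raw), reverse=True)[:2])

# Sorting the parsed dates gives the two January 2017 dates.
print([d.isoformat() for d in latest_dates(raw, limit=2)])
```

A value that does not match the format would raise ValueError here, so unparseable values would need separate handling.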