Apache Pig - Get only date from TimeStamp

I've the following code:
Source_Data = load '/user/cloudera/' using PigStorage('\t')
as
( ID:chararray,
Time_Interval:chararray,
Code:chararray);
transf = foreach Source_Data generate (int) ID,
ToString( ToDate((long) Time_Interval), 'yyyy-MM-dd hh:ss:mm') as TimeStamp,
(int) Code;
SPLIT transf INTO Src25 IF (ToString(TimeStamp, 'yyyy-MM-dd')=='2016-07-25'),
Src26 IF (ToString(TimeStamp, 'yyyy-MM-dd')=='2016-07-26');
STORE Src25 INTO '/user/cloudera/2016-07-25' using PigStorage('\t');
STORE Src26 INTO '/user/cloudera/2016-07-26' using PigStorage('\t');
I want to split the files by date, but the conditions I put in the SPLIT statement give me an error.
How can I transform TimeStamp (used in the transf statement) into a date to make the comparison?
Many thanks!

After you get the datetime object from ToDate, keep it as a datetime and derive the date part from it. GetYear(), GetMonth() and GetDay() combined with CONCAT can build the date, but they return unpadded integers (7 rather than 07), so ToString with a 'yyyy-MM-dd' pattern is the simpler way to get a string that compares equal to '2016-07-25'.
transf = foreach Source_Data generate
(int) ID,
ToDate((long) Time_Interval) as TimeStamp, -- keep the datetime; wrapping it in ToString here would turn it into a chararray
(int) Code;
transf_new = foreach transf generate
ID,
TimeStamp,
ToString(TimeStamp, 'yyyy-MM-dd') AS Day, -- date part only, zero-padded
Code;
-- Now use the new Day column to split the data
SPLIT transf_new INTO Src25 IF (Day =='2016-07-25'),
Src26 IF (Day =='2016-07-26');
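For reference, the same idea outside Pig: a minimal Python sketch, assuming Time_Interval holds epoch milliseconds (which is what ToDate on a long expects) and that dates are read in UTC. The sample rows and the split_by_day helper are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

def day_of(epoch_ms):
    """Return the 'yyyy-MM-dd' date part of an epoch-milliseconds value (UTC)."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")

def split_by_day(rows):
    """Group (id, epoch_ms, code) rows by their date string."""
    buckets = defaultdict(list)
    for row_id, epoch_ms, code in rows:
        buckets[day_of(epoch_ms)].append((row_id, epoch_ms, code))
    return dict(buckets)

# Hypothetical sample rows: (ID, Time_Interval in epoch ms, Code)
rows = [
    (1, 1469440800000, 10),  # 2016-07-25 10:00:00 UTC
    (2, 1469527200000, 20),  # 2016-07-26 10:00:00 UTC
]
buckets = split_by_day(rows)
```

Each key of buckets plays the role of one SPLIT branch, so a new date needs no new condition.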

Related

Splunk: Extract string and convert it to date format

I have such events:
something;<id>abc123<timeStamp>2021-12-10T23:10:12.044Z<timeStamp>2021-12-10T23:08:55.278Z>
I want to extract the Id abc123 and the two timeStamps.
index = something
|rex field=_raw "id>(?<Id>[0-9a-z-]+)"
|rex "timeStamp>(?<timeStamp>[T0-9-\.:Z]+)"
| table _time Id timeStamp
This works with the query above. What I am struggling with now is converting the timeStamp string to a date format, so that I can take min(timeStamp) and compute the difference between the event's _time and min(timeStamp) by the Id field. The difficulty is the format of the timestamp, which includes T and Z.
There's nothing special about those timestamps - they're in standard form. Use the strptime function to convert them.
index = something
|rex field=_raw "id>(?<Id>[^\<]+)"
|rex "timeStamp>(?<timeStamp>[^\<]+)"
| eval ts = strptime(timeStamp, "%Y-%m-%dT%H:%M:%S.%3N%Z")
| eval diff = ts - _time
| table _time Id timeStamp diff
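The question also asks for the earliest timeStamp per Id and its distance from the event time. As an illustration of that arithmetic only (Python, not SPL), here is a sketch that parses the sample event. The event_time value standing in for _time is invented, and the timestamps are assumed to be UTC because of the trailing Z.

```python
import re
from datetime import datetime, timezone

event = "something;<id>abc123<timeStamp>2021-12-10T23:10:12.044Z<timeStamp>2021-12-10T23:08:55.278Z>"

def parse_ts(text):
    """Parse an ISO 8601 UTC timestamp with fractional seconds into epoch seconds."""
    dt = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return dt.replace(tzinfo=timezone.utc).timestamp()

# Pull out the id and every timeStamp value from the raw event
event_id = re.search(r"id>([^<]+)", event).group(1)
stamps = re.findall(r"timeStamp>([^<>]+)", event)
earliest = min(parse_ts(s) for s in stamps)

# Hypothetical stand-in for the event's _time (epoch seconds, UTC)
event_time = parse_ts("2021-12-10T23:10:13.000Z")
diff = event_time - earliest  # seconds between the event and its earliest timeStamp
```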
Check out strftime.org and the related strptime function used with eval.
Something along these lines (I dropped the fractional seconds in the rex, so the result is a whole-second Unix epoch value):
| rex field=_raw "timeStamp\>(?<timeStamp>[^\.]+)\.\d+Z"
| eval unixepoch=strptime(timeStamp,"%Y-%m-%dT%H:%M:%S")

I want to compare from date and to date with date order using search orm in odoo

I want to compare from_date and to_date with date_order using the search ORM in Odoo.
I just want to extract the date alone, because date_order holds a date with a time. How do I work with it?
Here is my code:
from_date = fields.Date(string="From", default="Today")
to_date = fields.Date(string="To")
def update_commission(self):
    sale_br = self.env['sale.order']
    sale_sr = sale_br.search([('date_order', '=', self.from_date)])
Modify your function like this:
def update_commission(self):
    sale_br = self.env['sale.order']
    # date_order is a Datetime; .date() drops the time part before comparing with the Date fields
    sale_sr = sale_br.search([]).filtered(lambda sale: self.from_date <= sale.date_order.date() <= self.to_date)
If you are working with an older Odoo version such as 10 or 11, the field is returned as a string, so convert it to a datetime object first: fields.Datetime.from_string(sale.date_order).date(). The Date fields are strings there as well and can be converted with fields.Date.from_string.
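Filtering after search([]) loads every sale order first. An alternative is to express the date range directly in the search domain by turning the two dates into datetime bounds. Below is a minimal sketch of building such a domain. date_range_domain is a hypothetical helper, not part of Odoo, and it ignores time zones (Odoo stores datetimes in UTC, so a user's local day may be offset by a few hours).

```python
from datetime import date, datetime, time, timedelta

def date_range_domain(field, from_date, to_date):
    """Build a domain matching datetimes from from_date through to_date inclusive."""
    start = datetime.combine(from_date, time.min)
    # Exclusive upper bound: midnight of the day after to_date
    end = datetime.combine(to_date + timedelta(days=1), time.min)
    fmt = "%Y-%m-%d %H:%M:%S"
    return [(field, ">=", start.strftime(fmt)), (field, "<", end.strftime(fmt))]

# Example with made-up dates; in the model these would be self.from_date and self.to_date
domain = date_range_domain("date_order", date(2018, 6, 1), date(2018, 6, 30))
```

The resulting list would then be passed as sale_br.search(domain), which lets the database do the filtering.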

Create an inserted_at datetime where filter using simple date string

I'm trying to get records inserted after a certain date given to me by the client.
2018-06-06
Here's how I'm writing the query:
{:ok, date} = NaiveDateTime.from_iso8601(date_string)
from(
m in query,
where: m.inserted_at > ^date
)
But that fails with:
** (MatchError) no match of right hand side value: {:error, :invalid_format}
And when I try to use a simple Date object:
** (Ecto.Query.CastError) lib/messages/search.ex:77: value ~D[2018-06-06] in where cannot be cast to type :naive_datetime in query
How can I find all messages inserted after that dummy string date the client is passing me?
You have an ISO 8601 date there, not a datetime. You can convert it into a NaiveDateTime (with hour, minute, second all set to 0) like this:
iex(1)> date_string = "2018-06-06"
"2018-06-06"
iex(2)> ndt = NaiveDateTime.from_iso8601!(date_string <> " 00:00:00")
~N[2018-06-06 00:00:00]
Now you can use ndt in your query and it will work.

Convert packed DB2 iseries value to YYYY-MM-DD

I'm trying to select records from a DB2 Iseries system where the date field is greater than the first of this year.
However, the date fields I'm selecting from are actually PACKED fields, not true dates.
I'm trying to convert them to YYYY-MM-DD format and get everything greater than '2018-01-01' but no matter what I try it says it's invalid.
Currently trying this:
SELECT *
FROM table1
WHERE val = 145
AND to_date(char(dateShp), 'YYYY-MM-DD') >= '2018-01-01';
It says the expression is not valid using the format string specified.
Any ideas?
char(dateShp) is going to return a string like '20180319', so your format string should not include the dashes: 'YYYYMMDD'.
Example:
select to_date(char(20180101), 'YYYYMMDD')
from sysibm.sysdummy1;
So your code should be
SELECT *
FROM table1
WHERE val = 145
AND to_date(char(dateShp), 'YYYYMMDD') >= '2018-01-01';
Charles gave you a solution that converts the Packed date to a date field, and if you are comparing to another date field, this is a good solution. But if you are comparing to a constant value or another numeric field, you could just use something like this:
select *
from table1
where val = 145
and dateShp >= 20180101;

HIVE - group by date function

Can anyone tell me why I'm not getting counts for each f0, MONTH, DAY, HOUR, MINUTE group in my result set?
Query:
SELECT t.f0, MONTH(TO_DATE(Hex2Dec(t.f2))), DAY(TO_DATE(Hex2Dec(t.f2))), HOUR(TO_DATE(Hex2Dec(t.f2))), MINUTE(TO_DATE(Hex2Dec(t.f2))), COUNT(DISTINCT t.f1)
FROM table t
WHERE (t.f0 = 1 OR t.f0 = 2)
AND (t.f3 >= '2013-02-06' AND t.f3 < '2013-02-15')
AND (Hex2Dec(t.f2) >= 1360195200 AND Hex2Dec(t.f2) < 1360800000)
AND *EXTRA CONDITIONS*
GROUP BY t.f0, MONTH(TO_DATE(Hex2Dec(t.f2))), DAY(TO_DATE(Hex2Dec(t.f2))), HOUR(TO_DATE(Hex2Dec(t.f2))), MINUTE(TO_DATE(Hex2Dec(t.f2)))
Schema:
f0 INT (Partition Column)
f1 INT
f2 STRING
f3 STRING (Partition Column)
f4 STRING
f5 STRING
f6 STRING
f7 MAP<STRING,STRING>
*f2 is a unix timestamp in Hexadecimal format
This might be because to_date returns null when it's applied on a unix time.
According to the Hive manual:
to_date(string timestamp): Returns the date part of a timestamp string: to_date("1970-01-01 00:00:00") = "1970-01-01"
Use from_unixtime instead to get back the correct date parts.
Note:
I assume Hex2Dec UDF is taken from the core library of HIVE-1545
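To make the intended conversion concrete, here is a small Python sketch (not HiveQL) of what the query is trying to compute: decode the hexadecimal Unix timestamp to seconds, build a datetime from it, and only then take the date parts. The sample value is the hex form of 1360195200, the lower bound used in the query. UTC is assumed here, and the time zone Hive applies may differ.

```python
from datetime import datetime, timezone

def date_parts(hex_ts):
    """Decode a hex Unix timestamp and return (month, day, hour, minute) in UTC."""
    seconds = int(hex_ts, 16)
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return dt.month, dt.day, dt.hour, dt.minute

parts = date_parts("5112EE80")  # 0x5112EE80 == 1360195200
```

The grouping key is then the tuple of these parts, which is what the MONTH, DAY, HOUR and MINUTE expressions in the GROUP BY are meant to produce.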