PostgreSQL sorting with timestamps - postgresql

I have the following SQL statement:
SELECT * FROM schema."table"
WHERE "TimeStamp"::timestamp >= '2016-03-09 03:00:05'
ORDER BY "TimeStamp"::date asc
LIMIT 15
What do I expect it to do? Return 15 rows of the table whose timestamp is equal to or later than that date, in ascending order. But Postgres returns the rows in the wrong order: the first item is in the last position.
Does anyone have an idea why the result is this strange?

Simply use ORDER BY "TimeStamp" (without casting to date).

By casting "TimeStamp" to date you throw away the time part of the timestamp, so all values within one day are considered equal and are returned in an unspecified order. It is by accident that the first rows appear in the order you desire.
Don't cast to date in the ORDER BY clause if the time part is relevant for sorting.
Perhaps you are confused because Oracle's DATE type has a time part, which PostgreSQL's doesn't.
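To illustrate the point outside the database, here is a minimal Python sketch with made-up timestamps (the values are assumptions, not the asker's data). It shows that once the time part is dropped, all sort keys within one day are equal, so the sort can no longer order the rows by time.

```python
from datetime import datetime

# Hypothetical sample rows, all on the same calendar day.
rows = [
    datetime(2016, 3, 9, 9, 30, 0),
    datetime(2016, 3, 9, 3, 0, 5),
    datetime(2016, 3, 9, 7, 15, 0),
]

# Key is the date part only: every key is 2016-03-09, so the sort has
# nothing to distinguish the rows by. Python's sort is stable and keeps
# the input order; PostgreSQL makes no such promise for tied keys.
by_date = sorted(rows, key=lambda ts: ts.date())

# Key is the full timestamp: rows are ordered by time of day as well.
by_timestamp = sorted(rows)

print(by_date == rows)   # True: ties were left in input order
print(by_timestamp[0])   # 2016-03-09 03:00:05
```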

Related

How can I sum a column by date in KDB Q?

I have a table in KDB with columns date, time, and returns. I need the cumulative returns per day stored in a new column, but I'm not sure how to separate the sums function by date. This is what I tried:
Table: update cumreturns: sums returns from Table by date where time within (09:30:00;16:00:00)
But I get an evaluation error: date when I run it. Is my syntax off or is there another way I need to approach this?
You need to put the by clause before the from clause, something like this:
update cumReturns:sums returns by date from t
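For readers who do not know q, the effect of a running sum that restarts for each date can be sketched in Python. The rows below are hypothetical, and returns are given as whole basis points to keep the arithmetic exact. Unlike q's by clause, itertools.groupby only groups adjacent rows, so this sketch assumes the table is already ordered by date.

```python
from datetime import date
from itertools import accumulate, groupby

# Hypothetical (date, returns in basis points) rows, ordered by date and time.
table = [
    (date(2020, 1, 2), 10),
    (date(2020, 1, 2), 20),
    (date(2020, 1, 3), 50),
    (date(2020, 1, 3), -10),
]

cum_returns = []
for _, group in groupby(table, key=lambda row: row[0]):
    # Restart the running total for each date, like `sums returns by date`.
    cum_returns.extend(accumulate(ret for _, ret in group))

print(cum_returns)  # [10, 30, 50, 40]
```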

Postgres Date Compare with ISO timestamp

When I perform a greater-than (>) query on a timestamptz field, it seems to include dates that are equal to the date I'm querying with, at least when I'm comparing to an ISO date string.
select
created,
to_char(created, 'MI:SS:MS')
from
private.event
where
created > '2020-03-24T05:14:08.082Z'
Results
created |to_char |
-------------------|---------|
2020-03-24 18:14:08|14:08:082|
2020-03-24 18:14:08|14:08:180|
I'm not expecting the first row in that result.
FYI, if I adjust the query so that I compare with '2020-03-24T05:14:08.083Z', the row goes away.
Does anyone know what's going on here?
Postgres timestamps have microsecond resolution even if they're displayed with millisecond resolution. So you're effectively searching for >'2020-03-24T05:14:08.082000Z' while that first result is probably non-zero in one or more of those last three hidden digits.
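The same effect can be reproduced in Python, whose datetime also has microsecond resolution. This is only a sketch: the stored value below is hypothetical (the real hidden digits are unknown), and the millisecond display is assumed to truncate rather than round.

```python
from datetime import datetime, timezone

# The literal from the query: .082 seconds, i.e. 82000 microseconds.
threshold = datetime(2020, 3, 24, 5, 14, 8, 82000, tzinfo=timezone.utc)

# Hypothetical stored value with non-zero digits beyond the milliseconds.
stored = datetime(2020, 3, 24, 5, 14, 8, 82500, tzinfo=timezone.utc)

# Display with millisecond precision only (assumed to truncate).
shown = stored.strftime("%M:%S:") + f"{stored.microsecond // 1000:03d}"

print(shown)               # 14:08:082
print(stored > threshold)  # True: the hidden microseconds make it larger
```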

Cast varchar as date select distinct top 100

I am trying to fix a query whose problem has come to light in SSRS after the new year. We have an input that comes from another application. It grabs a date and stores it as varchar. The SSRS report then fetches the top 100 'dates', but now that 2017 dates have come around, these are not in the top 100.
The existing query is as follows
SELECT DISTINCT TOP (100) Date
FROM DenverTempData
ORDER BY Date DESC
The date is stored as VARCHAR. So obviously this query doesn't treat a value such as 01012017 as being in the top 100 (over values like 12312016). I thought maybe I could simply change the datatype of this column to datetime. But the information comes from a flat file and is converted, so it's a little more difficult than that. So I'm hoping to do a select of the distinct top 100 while converting the date column to datetime or just date and grabbing the last 100 dates.
Can someone help with the query syntax? I'm thinking a cast to convert varchar to date, but how do I format with distinct top 100? I'm simply looking to retrieve the last 100 dates in chronological order from a column that is stored as varchar but contains a string representing a date.
Hopefully that makes sense
It is always a bad idea to store a date as string. This is highly culture specific!
You can cast your formatted string-date to a real date like this:
DECLARE @DateMMDDYYYY VARCHAR(100)='12312016';
SELECT CONVERT(DATE,STUFF(STUFF(@DateMMDDYYYY,5,0,'-'),3,0,'-'),110)
After the conversion your sorting (and therefore the TOP 100) should work as expected.
My strong advice: Try to store your dates in a column of a real date type to avoid such hassle!
SELECT DISTINCT TOP 100 CAST(VarcharColumn AS DATE) AS DateColumn
FROM TABLE
ORDER BY DateColumn DESC
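Why the string sort fails, and why converting first fixes it, can be sketched in Python. The sample values are made up, and the MMDDYYYY layout is taken from the question's examples; this is an illustration of the idea, not a replacement for the T-SQL above.

```python
from datetime import datetime

# Hypothetical varchar values in MMDDYYYY form, including a duplicate.
raw = ["12312016", "01012017", "12302016", "01012017"]

# Sorting the text descending compares characters, so December 2016
# ("12...") ranks above January 2017 ("01...").
as_text = sorted(set(raw), reverse=True)

# Parse to real dates first, de-duplicate, then take the latest 100.
dates = {datetime.strptime(s, "%m%d%Y").date() for s in raw}
latest = sorted(dates, reverse=True)[:100]

print(as_text[0])  # 12312016
print(latest[0])   # 2017-01-01
```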

Get each value's difference from overall min value in Postgres

I have a table of data that is in timestamp with time zone format (called "time"). I also have an empty table that takes in interval data type values. For each row in the empty table, I want to insert the interval difference between that row's timestamp in the original data and the overall minimum timestamp value in the original data. I'm trying to do something like this:
INSERT INTO
time_pyramid
SELECT
"time" - MIN("time")
FROM
time_raw;
But it tells me "ERROR: column "time_raw.time" must appear in the GROUP BY clause or be used in an aggregate function". I know I want each timestamp value's interval difference from the table's overall minimum timestamp value, and "time" is not going to end up having duplicate values from this interval conversion, so I don't really think I should use GROUP BY in that context. I also see no reason to use an aggregation function on the first "time", so how can I fix my query to reflect what I want?
Edit: Actually, "Get each value as its interval difference from the min" is a better title for this question
Use min() as a window function:
with time_raw("time") as (
values
('2016-01-11'::timestamp),
('2016-01-01'::timestamp),
('2016-01-21'::timestamp)
)
select
"time"- min("time") over () as interval
from
time_raw;
interval
----------
10 days
00:00:00
20 days
(3 rows)
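The window function computes the overall minimum once and makes it available on every row. The same idea in Python, using the three sample values from the query above, looks roughly like this:

```python
from datetime import datetime, timedelta

# Same sample values as in the SQL example.
times = [
    datetime(2016, 1, 11),
    datetime(2016, 1, 1),
    datetime(2016, 1, 21),
]

# Overall minimum, computed once (the role of min("time") over ()).
earliest = min(times)

# Each row's difference from that minimum, in the original row order.
offsets = [t - earliest for t in times]

print(offsets == [timedelta(days=10), timedelta(0), timedelta(days=20)])  # True
```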

Aggregate by most recent not-null value

I have a dataset with the following columns [ product_id, country_id, date, number_of_installs, cumulative_installs_last_30_days ]
I have no problem applying the standard measures to find the sum, max or average number_of_installs within those three dimensions (product_id, country_id, date (aggregated by month or week)). However, I have not been able to aggregate by cumulative_installs_last_30_days: because that variable is already cumulative, I need to return the "most recent value", and Tableau does not have that option built into its aggregation functions.
How do I create a Calculated Field that adds an additional column to the aggregated dataset with the most recent not-null value of cumulative_installs_last_30_days within the dimensions product_id, country_id and date (aggregated by month or week)?
Here's a dirty solution.
In the comments, you noted that you wanted that 30 days to be dynamic, so to accomplish that, create a parameter, make it an integer, select Range, and allow any integer over zero. I'll call it [Number of Days].
Then create a calculated field:
TOTAL(SUM(IIF(DATEDIFF("day", [date], TODAY()) < [Number of Days], [Number of Installs], NULL)))
I know that's ridonk, so I'll break it down, from the inside out.
DATEDIFF("day", [date], TODAY())
That just calculates the difference in days between today and the date in a given row.
IIF(DATEDIFF("day", [date], TODAY()) < [Number of Days], [Number of Installs], NULL)
That checks if that difference is less than the number of days you selected. If it is, this statement is equal to the number of installs. If it's not, it's null. As a result, if we sum all of these values, we only get the number of installs in the last [Number of Days] days.
With that in mind, we SUM() the rows. TOTAL() just performs that sum over every database row that contributes to the partition.
Note that if your database has dates after TODAY(), you'll need to add another condition to that IIF() statement to make sure those aren't included.
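The logic of that calculated field can be sketched in Python to check the boundary cases. The function name, the sample rows and the fixed today value are all hypothetical; the age >= 0 check is the extra condition for dates after today mentioned above.

```python
from datetime import date

def installs_in_last_days(rows, number_of_days, today):
    """Sum installs for rows dated within the last number_of_days days."""
    total = 0
    for day, installs in rows:
        age = (today - day).days  # plays the role of DATEDIFF("day", [date], TODAY())
        # Rows failing the test contribute nothing, like NULL inside SUM().
        if 0 <= age < number_of_days:
            total += installs
    return total

# Hypothetical (date, number of installs) rows; the last one is in the future.
rows = [
    (date(2024, 1, 1), 5),
    (date(2024, 1, 20), 7),
    (date(2024, 1, 30), 3),
    (date(2024, 2, 5), 100),
]
today = date(2024, 1, 31)

print(installs_in_last_days(rows, 30, today))  # 10
print(installs_in_last_days(rows, 31, today))  # 15
```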
You also mentioned that you want to be able to aggregate the number of installs by month. That's MUCH easier. Just toss MONTH([date]) into the dashboard, then SUM([Number of Installs]), and Tableau will knock it out for you.