How to implement such query
SELECT DISTINCT ON (user_id) user_id, date FROM mayak_position WHERE user_id IN ('1111', '2222') ORDER BY user_id, date DESC;
with minimal raw sql, using django-orm
I tried many different queries, without success
Position.objects.annotate(s=Count('user_id', distinct=True)).order_by('s')
Position.objects.filter(user_id__in=child_ids).order_by('user_id', '-date').distinct('user_id')
source
result = Position.objects.filter(user_id__in=['1111', '2222']).values('user_id','date').distinct()
Related
I am using a custom query like this-
select * from my_table where id not in (select id from my_table where order_date <= '2022-01-01' and delivery_status in ('Pending'));
I am using it because i have to exclude those orders from my table where delivery_status is pending before the start_date and this date can be a random date which i want to add filter to in the quicksight .
But i am getting a wrong data when i add filter for this column in the analysis . it gives the result without considering the inner query in the analysis.
Any suggestions on whether i need to change the query or regarding analysis will be much appreciated .
You don't need a subquery. Your query says "get stuff where id not in a list where the status is pending.
Try.
select * from my_table where order_date <= '2022-01-01' and delivery_status = 'Pending'
So I am trying and struggling for few days to extend the schema with the custom groupby using something like this
I have a table with few fields like id, country, ip, created_at.
Then I am trying to get them as groups. For example, group the data based on date, hourly of date, or based on country, and based on country with DISTINCT ip.
I am zero with SQLs honestly. But I tried to play around and get what I want. Here's an example.
SELECT Hour(created_at) AS date,
COUNT(*) AS count
FROM session where CAST(created_at AS date) = '2021-04-05'
GROUP BY Hour(created_at)
ORDER BY date;
SELECT country,
count(*) AS count from (SELECT * FROM session where CAST(created_at AS date) <= '2021-05-12' GROUP BY created_at) AS T1
GROUP BY country;
SELECT country, COUNT(*) as count
FROM (SELECT DISTINCT ip, country FROM session) AS T1
GROUP BY country;
SELECT DATE(created_at) AS date,
COUNT(*) AS count
FROM session
GROUP BY DATE(created_at)
ORDER BY date;
Now I am struggling with two things.
How do I make the date as variables? I mean, if I want to group them for a particular date range/ or today's data hourly, or per quarter gap (more of configurable), how do I add the variables in Hasura's Raw SQL?
Also for this approach I have to add schema for each one of them? Like this
CREATE
OR REPLACE VIEW "public"."unique_session_counts_date" AS
SELECT
date(session.created_at) AS date,
count(*) AS count
FROM
session
GROUP BY
(date(session.created_at))
ORDER BY
(date(session.created_at));
Is there a way to make it more generalized? What I mean is, if it
was in Nodejs I could have done something like
return rawQuery(
`
select ${field} x, count(*) y
from ${table}
where website_id=$1
and created_at between $2 and $3
${domainFilter}
${urlFilter}
group by 1
order by 2 desc
`,
params,
);
In this case, based on whatever field and where clause I send, one query would do the trick for me. Can do something similar in hasura?
Thank you so much in advance.
How do I make the date as variables? I mean, if I want to group them for a particular date range/ or today's data hourly, or per quarter gap (more of configurable), how do I add the variables in Hasura's Raw SQL?
My first thought is this. If you're thinking about passing in variables via a GraphQL for example, the GraphQL would look something like:
query MyQuery {
unique_session_counts_date(where: {created_at: {_gte: "<start date here>", _lte: "<end date here>"}}) {
<...any fields, rollups, etc here...>
}
}
The underlying view/query would follow the group by and order by that you've detailed. Then you'd be able to submit a query of the graphql query and just pass in the pertinent parameters like the $1, $2, and $3 in the raqQuery call.
Also for this approach I have to add schema for each one of them?
The schema? The view? I don't think a view specifically would be required, if a multilevel select or similar query can handle it and perform then a view wouldn't particularly be needed.
That's my first stab at the problem. I'm going to try to work through this problem in a few hours via a Twitch stream # HasuraHQ if you can join, happy to walk through it live.
I have a list of dates, accounts, and sources of data. I'm taking the latest max date for each account and using that number in my window reference.
In my window reference, I'm using row_number () to assign unique rows to each account and sources of data that we're receiving and sorting it by the max date for each account and source of data. The end result should list out one row for each unique account + source of data combination, with the max date available in that combination. The record with the highest date will have 1 listed.
I'm trying to set a condition on my window function where only rows that populate with 1 are listed in the query, while the other ones are not shown at all. This is what I have below and where I get stuck:
SELECT
date,
account,
data source,
MAX(date) max_date,
ROW_NUMBER () OVER (PARTITION BY account ORDER BY max_date) ROWNUM
FROM table
GROUP BY
date,
account,
data source
Any help is greatly appreciated. I can elaborate on anything if necessary
If I understood your question correctly this SQL would do the trick
SELECT
date,
account,
data source,
MAX(date) max_date
FROM (
SELECT
date,
account,
data source,
MAX(date) max_date,
ROW_NUMBER () OVER (PARTITION BY account ORDER BY max_date) ROWNUM
FROM table
GROUP BY
date,
account,
data source
)
where ROWNUM = 1
If you do not need the row number for anything other than uniqueness then a query like this should work:
select distinct t.account, data_source, date
from table t
join (select account, max(date) max_date from table group by account) m
on t.account=m.account and t.date=m.max_date
This can still generate two records for one account if two records for different data sources have the identical date. If that is a possibility then mdem7's approach is probably best.
It's a bit unclear from the question but if you want each combination of account and data_source with its max date making sure there are no duplicates, then distinct should be enough:
select distinct account, data_source, max(date) max_date
from table t
group by account, data_source
can someone help me to write a query?
I have for example columns:
Date
product_key
category_code
In one day I expect to have same category_code for one product, but I want to check this with SQL.
Thank you.
If you want to find the day, the product_key and the category_code that doubles, You should use query like this:
SELECT
date,
product_key,
category_code,
count(1)
FROM your_table
GROUP BY date, product_key, category_code
HAVING count(1) > 1;
You can group your results by date and product, and use count and distinct to find if there is more than one category code for a product. You can then filter rows having more than 1 distinct category in the group.
SELECT
Date, product_key, count(distinct category_code) AS categories
FROM
my_table
GROUP BY
Date, product_key
HAVING
count(distinct category_code) > 1
I've come across a vexing problem with a PostgreSQL query. This works in my local development environment:
SELECT distinct (user_id) user_id, created_at, is_goodday
FROM table
WHERE ((created_at >= '2011-07-01 00:00:00') AND user_id = 95
AND (created_at < '2011-08-01 00:00:00'))
ORDER BY user_id, created_at ASC;
...but gives the following error on my QA server (which is on Heroku):
PGError: ERROR: syntax error at or near "user_id"
LINE 1: SELECT distinct (user_id) user_id, created_at,
^
Why could this be?
Other possibly relevant info:
I have tried single-quoting and double-quoting the field names
It's a Rails 3 app, but I'm using this SQL raw, i.e. no ActiveRecord magic
My local version of Postgres is 9.0.4 on Mac, but I have no idea what version Heroku is using
As per your comment, the standard PostgreSQL version of that query would be:
SELECT user_id, created_at, is_goodday
FROM table
WHERE created_at >= '2011-07-01 00:00:00'
AND created_at < '2011-08-01 00:00:00'
AND user_id = 95
ORDER BY created_at DESC, id DESC
LIMIT 1
You don't need user_id in the ORDER BY because you have user_id = 95, you want created_at DESC in the ORDER BY to put the most recent created_at at the top; then you LIMIT 1 to slice off just the first row in the result set. GROUP BY can be used to enforce uniqueness or if you need to group things for an aggregate function but you don't need it for either one of those here as you can get uniqueness through ORDER BY and LIMIT and you can hide your aggregation inside the ORDER BY (i.e. you don't need MAX because ORDER BY does that for you).
Since you have user_id = 95 in your WHERE, you don't need user_id in the SELECT but you can leave it in if that makes it easier for you in Ruby-land.
It is possible that you could have multiple entries with the same created_at so I added an id DESC to the ORDER BY to force PostgreSQL to choose the one with the highest id. There's nothing wrong with being paranoid when they really are out to get you and bugs definitely are out to get you.
Also, you want DESC in your ORDER BY to get the highest values at the top, ASC puts the lowest values at the top. The more recent timestamps will be the higher ones.
In general, the GROUP BY and SELECT have to match up because:
When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column.
But that doesn't matter here because you don't need a GROUP BY at all. I linked to the 8.3 version of the documentation to match the PostgreSQL version you're using.
There are probably various other ways to do this but this one as probably as straight forward and clear as you're going to get.
put a quote in user_id like user_id = '95'. Your query should be
SELECT distinct (user_id) as uid, created_at, is_goodday FROM table WHERE
((created_at >= '2011-07-01 00:00:00') AND user_id = '95' AND (created_at < '2011-08-01 00:00:00')) ORDER BY user_id, created_at ASC;
You're using DISTINCT ON (without writing the ON). Perhaps you should write the ON. Perhaps your postgres server dates from before the feature was implemented (which is pretty old by now).
If all else fails, you can always do that with some GROUP BY...