Error While running query to find count difference - amazon-redshift

I want to find the difference between two counts from the same view, i.e. the difference between the total count and the distinct count of the VIEW view_z_a_base.
FYI - The view structure is -
TABLE z_a is created from the VIEW view_z_a which is created from VIEW view_z_a_base.
Query Being Executed -
SELECT COUNT(*) -(SELECT COUNT(*)FROM (SELECT DISTINCT * FROM
<schema>.vw_z_a_base))AS no_of_duplicates
FROM <schema>.vw_z_a_base;
EXPECTED RESULT -
Difference of the 2 counts -
?column?
0
ACTUAL RESULT -
ERROR MESSAGE -
An error occurred when executing the SQL command:
SELECT COUNT(*) -(SELECT COUNT(*)
FROM (SELECT DISTINCT * FROM <schema>.vw_z_a_base))
FROM <schema>.vw_z_a_base
[Amazon](500310) Invalid operation: Not implemented
Details:
-----------------------------------------------
error: Not implemented
code: 1001
context: 'value' - Ill formed PARAM_EXEC in expression
query: 41412011
location: pg_utils.cpp:1710
process: padbmaster [pid=22726]
-----------------------------------------------;
1 statement failed.
The same query works for tables but not for views. Why is this happening?

I ran into a similar error, and in my case it came down largely to typecasting. I'm including it below because there are literally three Google results for this error, so hopefully it helps someone. In your case I suspect the problem is the SELECT COUNT(*) FROM (SELECT DISTINCT * FROM <schema>.vw_z_a_base) element. I think you may be taking unnecessary steps here, which could be returning something that the engine has a hard time performing subtraction with. You should be able to eliminate the nested select, which also makes things easier to read. Try:
SELECT COUNT(*) - COUNT(DISTINCT *) AS no_of_duplicates FROM <schema>.vw_z_a_base
Returning to my case, my code went from:
case when coalesce(allup_charges,0) > 156.04 then 1
to:
case when coalesce(allup_charges::numeric,0)>156.04 then 1
This resolved my issue. I also found that MySQL Workbench was much less fussy about typecasting here than Tableau. If you continue to run into the issue, try over-typecasting and you should be fine.
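For anyone who wants to try the count difference outside Redshift, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The table name z_a_base and its rows are made up for illustration, and SQLite is not Redshift, so this only shows the shape of the query; I have not verified that it avoids the "Not implemented" error on a view. It computes each count in its own scalar subquery rather than mixing an aggregate with a subquery in one SELECT. It keeps the SELECT DISTINCT * subquery because COUNT(DISTINCT *) is not accepted by every engine (PostgreSQL, for one, rejects it).

```python
import sqlite3

# In-memory SQLite database; table name and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE z_a_base (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO z_a_base VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "b"), (3, "c"), (3, "c")],
)

# Each count is its own scalar subquery; the outer SELECT only subtracts.
query = """
SELECT (SELECT COUNT(*) FROM z_a_base)
     - (SELECT COUNT(*) FROM (SELECT DISTINCT * FROM z_a_base) AS d)
       AS no_of_duplicates
"""
no_of_duplicates = conn.execute(query).fetchone()[0]
print(no_of_duplicates)  # 5 rows in total, 3 distinct rows
```

With the sample rows above, the difference is 2, one for each repeated row.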

Related

What does CASE WHEN EXISTS (select * from table) do? Is there a better way to write this?

I need to resolve a performance issue with the following query (not my code). It's a section of a larger stored procedure. This particular part is taking a very long time to execute (I just cancelled it after 20 mins) and I don't understand the intent of the case statement in the code. My research to date has only shown examples of checking for a particular column using schema tables, or checking if any records exist using select 1...
This is doing select *. Can someone help me to understand the case when exists subquery part of this query and if there is a better (more efficient in terms of time to execute) way to write this?
INSERT INTO [PaymentStage06]
SELECT
CASE
WHEN EXISTS (SELECT *
FROM [transaction] AS [MPT])
THEN 'Y'
ELSE 'N'
END AS [DuplicateReleased],
[MPT].[participantid] AS [ParticipantID],
[MPT].[courseid] AS [CourseID],
[MPT].[session] AS [SessionID],
[MPT].[feedbackcompletedid] AS [TransactionID],
GETDATE() AS [Refreshed]
FROM
[transaction] AS [MPT]
Many thanks.
Andrew
I've looked for examples of "case when exists(select *" but haven't found anything concrete. My guess is it's either checking for the existence of ANY record in that table (in which case why not just select 1), OR it's checking for a match based on all columns, but both of those seem pointless when the outer query is working on the same table. I'm obviously missing the point :)
This is the estimated execution plan https://www.brentozar.com/pastetheplan/?id=H1VMU7Goj
According to the execution plan, it is performing a 'left semi join' operation between the inner and outer query. This appears to be like an intersect query... but still pointless because it's working on the 1 table.
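To check the first guess (that the subquery only tests whether the table has any row at all), here is a small sketch using Python's sqlite3 module. The table txn and its rows are hypothetical stand-ins for [transaction], and SQLite is not SQL Server, but EXISTS has the same meaning in both: with no WHERE clause tying the subquery to the outer row, it is true whenever the table is non-empty.

```python
import sqlite3

# Hypothetical stand-in for the [transaction] table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (participantid INTEGER, courseid INTEGER)")
conn.executemany("INSERT INTO txn VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# The EXISTS subquery never references t_out, so it is uncorrelated:
# it only asks whether txn contains at least one row.
rows = conn.execute(
    """
    SELECT CASE WHEN EXISTS (SELECT * FROM txn AS t_in)
                THEN 'Y' ELSE 'N' END AS DuplicateReleased,
           t_out.participantid
    FROM txn AS t_out
    ORDER BY t_out.participantid
    """
).fetchall()
print(rows)
```

Every row comes back with 'Y'. In the original query the inner and outer aliases are both [MPT], so the inner one shadows the outer one and the result is the same constant flag. If the intent was to flag duplicates, the subquery would need a WHERE clause comparing its columns to the outer row.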

Ecto.Adapters.SQL.query! gives a different result

So this is apparently one of these weird days... And I know this makes 0 sense.
I'm executing a query in DataGrip (a tool to execute raw queries) against the exact same database as my Phoenix application uses, and the two are returning different results.
The query is quite complicated, but it's the only query that shows different results. So I cannot simplify it. I've tried other queries to be sure that I'm having the same database, restarted the server etc.
Here is the exact same query executed from my console. As you can see it is not the same result. A few rows are missing.
I have also checked if this is a timing issue by executing select now() => same result (more or less obviously). If I execute only the generate_series part, it returns the same result. So it could have something to do with the join.
I also checked the last few entries in the ttnmessages table just to be sure there is no general caching issue. The queries do also give the same result there.
So my question is: Is there anything that Ecto does differently upon executing a query? How can I figure this out? I'm grateful for any hint.
EDIT: The query is in both cases:
SELECT g.series AS time, MAX((t.payload ->'pulse')::text::numeric) as pulse
FROM generate_series(date_trunc('hour', now())- INTERVAL '12 hours', date_trunc('hour', now()), INTERVAL '60 min') AS g(series)
LEFT JOIN ttnmessages t
ON t.inserted_at < g.series + INTERVAL '60 min'
AND t.inserted_at > g.series
WHERE t.hardware_serial LIKE '093B55DF0C2C525A'
GROUP BY g.series
ORDER BY g.series;
While I did not find out the cause, I changed the query to the following:
SELECT MAX(t.inserted_at) as time, (t.payload ->'pulse')::text::numeric as pulse
FROM ttnmessages t
WHERE t.inserted_at > now() - INTERVAL '12 hours'
AND t.payload ->'pulse' IS NOT NULL
AND t.hardware_serial LIKE '093B55DF0C2C525A'
GROUP BY (t.payload ->'pulse')
ORDER BY time;
Runtime is < 50ms, so I'm happy with the result.
And I'll ignore the different results from the question. The query here returns the same result just like it's supposed to.

GROUP BY date error after update to MySQL 5.7

I have a simple script that counts form leads and displays the counts by month and year. It worked fine until I upgraded to MySQL 5.7. Now I get this error:
There was an error running the query [Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'form.form_25.submission_date' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by]
My query is:
SELECT YEAR(`submission_date`) AS yr,
MONTH(`submission_date`) AS mth,
DATE_FORMAT(`submission_date`,'%M %Y') AS display_date,
COUNT(*) AS leadcount
FROM form_25
WHERE `submission_date` >= CURRENT_DATE - INTERVAL 1 YEAR
GROUP BY yr,mth
ORDER BY yr DESC, mth DESC
I realize this is because only_full_group_by is enabled, but I don't want to disable it.
I've researched this problem, but it seems like all of the suggested solutions are about grouping by a unique column. That isn't a solution in this case because grouping by my primary column does not display the lead counts properly.
Thanks in advance for your help.
Okay, I figured out a solution that is good enough for my purposes. I discovered that the error only happens when this line is included:
DATE_FORMAT(`submission_date`,'%M %Y') AS display_date,
So I removed that line and recreated the display_date variable in PHP by using the yr and mth aliases.
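My actual code is PHP, but the idea is simple enough to sketch in Python. The rows below are made-up sample data in the shape the trimmed query returns (yr, mth, leadcount), and the helper name display_date is just for illustration. It rebuilds the same label that DATE_FORMAT(..., '%M %Y') produced.

```python
# English month names, matching MySQL's %M format specifier.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)

def display_date(yr, mth):
    # Full month name followed by the four-digit year, e.g. "March 2019".
    return f"{MONTHS[mth - 1]} {yr}"

# Hypothetical result rows: (yr, mth, leadcount), newest first.
rows = [(2019, 3, 42), (2019, 2, 17), (2018, 12, 8)]

labelled = [(display_date(yr, mth), count) for yr, mth, count in rows]
print(labelled)
```

Since the label is derived only from the two grouped columns, nothing ungrouped has to appear in the SELECT list.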

simple sql LIMIT statement

Can anyone tell me if this is a valid SQL statement?
SELECT * FROM myTable WHERE foo="bar" ORDER BY foo_date DESC LIMIT 0, 5
I've searched online but the difficulty is a) inexperience and b) finding an example that contains the components I'm trying to assemble.
I want the statement to select ALL from the table where condition1 is true, then order the results by descending date, then LIMIT the returned rows to those specified...
many thanks to anyone that can give me a pointer in the right direction.
Scott
SELECT * FROM myTable WHERE foo="bar" ORDER BY foo_date DESC LIMIT 0, 5
...is fine - runs in phpMyAdmin no problem. I should have tested there before posting on SO - apologies.

postgres syntax error in sql

select * from (
select max(h.updated_datetime) as max, min(h.updated_datetime) as min from report r, report_history h, procedure_runtime_information PRI, study S
where
h.report_fk=r.pk and
r.study_fk=S.pk and
PRI.pk=S.procedure_runtime_fk and
extract(epoch from (max(h.updated_datetime) - min(h.updated_datetime) ) <=900 and
h.pk IN (
select pk from
(select * from report_history where report_fk=r.pk) as result
)
and r.status_fk =21 group by r.pk)as result1;
This is my query. I have a syntax error; can anyone help me fix it?
Thanks in advance.
As you didn't bother telling us what the error is, I have to guess that it's this line:
AND h.pk IN (SELECT pk FROM (SELECT * FROM report_history WHERE report_fk=r.pk) AS RESULT)
The nesting level for the where condition is "too deep" and I think it cannot see the r alias in the where clause.
But the nested select is totally useless in your case anyway, so you can rewrite that condition as:
AND h.pk IN (SELECT pk FROM report_history WHERE report_fk=r.pk)
Even if that doesn't solve your problem, it makes your query more readable.
You are also using an aggregate in the WHERE clause, which is not allowed; you have to move it to a HAVING clause:
having extract(epoch from (max(h.updated_datetime) - min(h.updated_datetime))) <=900
The HAVING clause comes after the GROUP BY.
You were also missing a closing ), but that is hard to spot because of your formatting (which I find very hard to read).
You should also get used to explicit JOIN syntax. The implicit joins in the WHERE clause are error-prone and no longer recommended.
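Putting those points together, here is a rough sketch of the shape I mean, run against SQLite through Python's sqlite3 module so it can be tried without a Postgres server. The sample rows are made up, the study and procedure_runtime_information joins are left out to keep it short, and r.pk is added to the SELECT list for readability. SQLite has no extract(epoch from ...), so the sketch uses strftime('%s', ...) for the same purpose; in Postgres you would keep the extract expression from the HAVING line above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE report (pk INTEGER PRIMARY KEY, status_fk INTEGER);
    CREATE TABLE report_history (
        pk INTEGER PRIMARY KEY, report_fk INTEGER, updated_datetime TEXT
    );
    -- Hypothetical sample data.
    INSERT INTO report VALUES (1, 21), (2, 21), (3, 5);
    INSERT INTO report_history VALUES
        (1, 1, '2019-01-01 10:00:00'),
        (2, 1, '2019-01-01 10:10:00'),  -- report 1 spans 600 seconds
        (3, 2, '2019-01-01 10:00:00'),
        (4, 2, '2019-01-01 11:00:00'),  -- report 2 spans 3600 seconds
        (5, 3, '2019-01-01 10:00:00');  -- report 3 has the wrong status
    """
)

# Explicit JOIN; row filters stay in WHERE, the aggregate filter goes in HAVING.
rows = conn.execute(
    """
    SELECT r.pk,
           MAX(h.updated_datetime) AS max_dt,
           MIN(h.updated_datetime) AS min_dt
    FROM report AS r
    JOIN report_history AS h ON h.report_fk = r.pk
    WHERE r.status_fk = 21
    GROUP BY r.pk
    HAVING CAST(strftime('%s', MAX(h.updated_datetime)) AS INTEGER)
         - CAST(strftime('%s', MIN(h.updated_datetime)) AS INTEGER) <= 900
    """
).fetchall()
print(rows)
```

Only report 1 survives: report 2 spans more than 900 seconds and report 3 is filtered out by the status check before grouping.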