Hive FAILED: Parse Exception line 3:39 mismatched input - hiveql

I'm trying to write a query that finds any row whose salary is more than 10000 above the average salary for its role, but when I run it I get this error:
FAILED: ParseException line 3:39 mismatched input 'SELECT' expecting ) near '''' in expression specification
This is the query I'm using:
set AVERAGES ='SELECT ROLE, AVG(AnnualSalary) From Salaries GROUP BY ROLE';
SELECT ROLE, AVG(AnnualSalary) FROM Salaries
GROUP BY ROLE, AnnualSalary HAVING AnnualSalary > ('${hiveconf:AVERAGES}' + 10000);

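Hive currently does not support storing a query result in a variable. Hive variables are substituted as plain text before the query is parsed, so ${hiveconf:AVERAGES} is replaced by the query string itself, not by its result.
You can use a window function to achieve this: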
select * from
( select *,
avg(AnnualSalary) over(partition by ROLE) role_avg
from
Salaries
) a
where
AnnualSalary > role_avg+10000
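To see what the window-function query returns, here is a minimal sketch that runs the same shape of query against SQLite from Python. The sample rows are made up for illustration, and SQLite (3.25 or newer, for window functions) is only a stand-in for Hive, so treat it as a demonstration of the logic, not of Hive itself.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
rows = [
    ("Engineer", 50000), ("Engineer", 52000), ("Engineer", 90000),
    ("Analyst", 40000), ("Analyst", 41000),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Salaries (ROLE TEXT, AnnualSalary REAL)")
conn.executemany("INSERT INTO Salaries VALUES (?, ?)", rows)

# The inner query attaches the per-role average to every row;
# the outer query keeps rows more than 10000 above that average.
query = """
SELECT ROLE, AnnualSalary, role_avg
FROM (
    SELECT ROLE, AnnualSalary,
           AVG(AnnualSalary) OVER (PARTITION BY ROLE) AS role_avg
    FROM Salaries
) a
WHERE AnnualSalary > role_avg + 10000
"""
result = conn.execute(query).fetchall()
print(result)
```

With this data only the 90000 Engineer row qualifies, because the Engineer average is 64000 and no Analyst is more than 10000 above the Analyst average.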

LPAD function errors when used in WITH variable in Redshift

Can you tell me why this is throwing an error in Redshift?
WITH Testing_PADDING AS (SELECT '12345678' AS column1)
SELECT LPAD(column1, 9,'0') FROM Testing_PADDING;
Here is the error I receive:
"Invalid operation: failed to find conversion function from "unknown" to text;"
Redshift can't determine the data type of the literal from the context, so you need to set it explicitly:
WITH Testing_PADDING AS (SELECT '12345678'::text AS column1)
SELECT
LPAD(column1, 9, '0')
FROM Testing_PADDING;
I suspect that one of your strings isn't being seen as text, most likely the column1 literal. (Sorry, I don't have a cluster up right now to test.)
Try:
WITH Testing_PADDING AS (SELECT '12345678'::text AS column1)
SELECT LPAD(column1, 9,'0'::text) FROM Testing_PADDING;

It can't calculate count(*) in a query onto DB2 database

I want to do a count(*) of the number of rows from a DB2 database.
The basic query is the following:
select
SUBSTR("Request_Detail",LOCATE('/',"Request_Detail")+1,LOCATE('/',"Request_Detail",LOCATE('/',"Request_Detail")+1)-LOCATE('/',"Request_Detail"))
from "Request_Analisys"
WHERE
"Sample_Date_and_Time">=1200323230000000 and "Sample_Date_and_Time"<1200332300000000
and "Request_Detail" <> '[Summary]'
and "Request_Detail" not like 'WS:%'
Now I'd like to do a count(*) of the resulting rows, but if I do a query like this:
select
count(*),
SUBSTR("Request_Detail",LOCATE('/',"Request_Detail")+1,LOCATE('/',"Request_Detail",LOCATE('/',"Request_Detail")+1)-LOCATE('/',"Request_Detail"))
from "Request_Analisys"
WHERE
"Sample_Date_and_Time">=1200323230000000 and "Sample_Date_and_Time"<1200332300000000
and "Request_Detail" <> '[Summary]'
and "Request_Detail" not like 'WS:%'
It gives the error:
18:51:58 FAILED [SELECT - 0 rows, 0.032 secs] 1) [Code: -119, SQL State: 42803] An expression starting with "Request_Detail" specified in a SELECT clause, HAVING clause, or ORDER BY clause is not specified in the GROUP BY clause or it is in a SELECT clause, HAVING clause, or ORDER BY clause with a column function and no GROUP BY clause is specified.. SQLCODE=-119, SQLSTATE=42803, DRIVER=4.22.29
2) [Code: -727, SQL State: 56098] An error occurred during implicit system action type "2". Information returned for the error includes SQLCODE "-119", SQLSTATE "42803" and message tokens "Request_Detail".. SQLCODE=-727, SQLSTATE=56098, DRIVER=4.22.29
How can I get the count of the rows?
Which row's substring of Request_Detail would you expect to see next to the count?
If you count the rows, the result set is a single row, so selecting any other column alongside the count makes no sense.
If you want multiple rows, with a count for each distinct substring, you need to GROUP BY that substring.
This may work:
select
count(
SUBSTR("Request_Detail"
,LOCATE('/',"Request_Detail")+1
,LOCATE('/',"Request_Detail",LOCATE('/',"Request_Detail")+1)
-LOCATE('/',"Request_Detail"))
)
from "Request_Analisys"
WHERE
"Sample_Date_and_Time">=1200323230000000 and "Sample_Date_and_Time"<1200332300000000
and "Request_Detail" <> '[Summary]'
and "Request_Detail" not like 'WS:%'
But if not, this should:
with cte as (
select
SUBSTR("Request_Detail"
,LOCATE('/',"Request_Detail")+1
,LOCATE('/',"Request_Detail",LOCATE('/',"Request_Detail")+1)
-LOCATE('/',"Request_Detail")) as mydetail
from "Request_Analisys"
WHERE
"Sample_Date_and_Time">=1200323230000000 and "Sample_Date_and_Time"<1200332300000000
and "Request_Detail" <> '[Summary]'
and "Request_Detail" not like 'WS:%'
)
select count(*) from cte
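The two result shapes described in this answer (one total, or one count per extracted value) can be sketched with SQLite from Python. This is only an illustration: the rows are invented, and the extraction expression is a simplified stand-in (the text before the first '/') because SQLite has no three-argument LOCATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Request_Analisys (Request_Detail TEXT)")
conn.executemany(
    "INSERT INTO Request_Analisys VALUES (?)",
    [("app/orders/list",), ("app/orders/new",), ("web/users/show",)],
)

# Simplified stand-in for the SUBSTR/LOCATE expression: text before the first '/'.
prefix = "substr(Request_Detail, 1, instr(Request_Detail, '/') - 1)"

# One total: wrap the original query in a CTE and count its rows.
total = conn.execute(
    f"WITH cte AS (SELECT {prefix} AS mydetail FROM Request_Analisys) "
    "SELECT count(*) FROM cte"
).fetchone()[0]

# One count per value: group by the same expression that is selected.
per_value = conn.execute(
    f"SELECT {prefix} AS mydetail, count(*) FROM Request_Analisys "
    f"GROUP BY {prefix} ORDER BY mydetail"
).fetchall()

print(total, per_value)
```

The first query collapses everything to a single number; the second keeps one row per distinct extracted value, which is what the GROUP BY is for.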
I suggest you use REGEXP_EXTRACT to pick what you want out of your "Request_Detail" column. This is more flexible than using SUBSTR and LOCATE, and it avoids the error "The statement was not executed because a numeric argument of a scalar function is out of range."
For example:
select
REGEXP_EXTRACT("Request_Detail",'.*/(.+/)',1,1,'',1)
, SUBSTR("Request_Detail",LOCATE('/',"Request_Detail")+1,LOCATE('/',"Request_Detail",LOCATE('/',"Request_Detail")+1)-LOCATE('/',"Request_Detail"))
FROM TABLE(VALUES('aaaa/bbbb/ccc')) AS T("Request_Detail")
returns
1 |2
------|-----
bbbb/ |bbbb/
so, you could then do this
SELECT
COUNT(*)
, REGEXP_EXTRACT("Request_Detail",'.*/(.+/)',1,1,'',1)
FROM
"Request_Analisys"
GROUP BY
REGEXP_EXTRACT("Request_Detail",'.*/(.+/)',1,1,'',1)
for example

syntax error at or near "'select to_char(application_date::timestamp, '"

EXECUTE 'select to_char(application_date::timestamp, 'Mon-YY') as appl_month from my_schema.my_table;';
The above PostgreSQL EXECUTE statement is giving the below error:
ERROR: syntax error at or near "'select
to_char(application_date::timestamp, '" LINE 1: EXECUTE 'select
to_char(application_date::timestamp, 'Mon-YY...
^
********** Error **********
ERROR: syntax error at or near "'select
to_char(application_date::timestamp, '" SQL state: 42601 Character: 9
Any suggestions will be helpful.
I changed it to the statement below:
EXECUTE 'select to_char(application_date::timestamp, ' || quote_literal(Mon-YY) || ') from standard.npo_weekly_export;';
But it gives a new error:
ERROR: syntax error at or near "'select to_char(application_date::timestamp, '"
LINE 1: EXECUTE 'select to_char(application_date::timestamp, ' || qu...
^
********** Error **********
ERROR: syntax error at or near "'select to_char(application_date::timestamp, '"
SQL state: 42601
Character: 9
Expected output: counts by month in Mon-YY format
Application month |Application # |Final Approval #
------------------|--------------|----------------
Jan-17            |1,000         |800
Feb-17            |1,010         |808
Mar-17            |1,020         |816
Apr-17            |1,030         |824
If I do the below query:
select to_char(application_date, 'Mon-YY') as appl_month,
count(distinct application_id) as appl_count,
sum(final_approval_ind) as fa_count
from my_schema.my_table
group by appl_month
order by appl_month;
Generated output: (Note: Sorted by text, not by date)
"Apr-17";94374;19953
"Apr-18";87446;20903
"Aug-17";102043;21536
"Aug-18";91107;20386
"Dec-17";63263;13755
"Dec-18";21358;74
"Feb-17";89447;18084
"Feb-18";75426;16144
"Jan-17";86103;16394
"Jan-18";79403;17766
"Jul-17";90380;18929
"Jul-18";85439;20186
"Jun-17";95596;20403
"Jun-18";85764;18707
"Mar-17";112929;23323
"Mar-18";91179;21841
"May-17";101907;22349
"May-18";90885;21550
"Nov-17";78284;16791
"Nov-18";80472;7656
"Oct-17";87955;18524
"Oct-18";82821;17056
"Sep-17";80740;17788
"Sep-18";75785;18009
Problem: to_char() returns text and it sorts by text and not by date. So the output is jumbled rather than sorted by Mon-YY.
Do the aggregation in a derived table (aka "sub-query") that preserves the data type, then do the sorting in the outer query:
select to_char(ap_month, 'Mon-YY') as appl_month,
appl_count,
fa_count
from (
select date_trunc('month', application_date) as ap_month,
count(distinct application_id) as appl_count,
sum(final_approval_ind) as fa_count
from my_schema.my_table
group by ap_month
) t
order by ap_month;
date_trunc('month', application_date) will normalize the application_date to the start of the month, but will retain the date data type, so that the sorting in the outer query works correctly.
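The difference between sorting the formatted text and sorting the underlying date can be shown outside the database as well. The following Python sketch uses a few made-up months; the label helper is hypothetical and only mimics to_char(..., 'Mon-YY').

```python
from datetime import date

# Hypothetical month-start dates, as date_trunc('month', ...) would produce.
months = [date(2018, 1, 1), date(2017, 4, 1), date(2017, 12, 1), date(2018, 4, 1)]

ABBREV = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def label(d: date) -> str:
    """Format a date as Mon-YY, e.g. Apr-17."""
    return f"{ABBREV[d.month - 1]}-{d.year % 100:02d}"

# Sorting the formatted text: alphabetical, so months are jumbled.
by_text = sorted(label(d) for d in months)

# Sorting the dates first and formatting afterwards: chronological.
by_date = [label(d) for d in sorted(months)]

print(by_text)
print(by_date)
```

Sorting first and formatting last is the same ordering of steps as the derived-table query: the inner query keeps a real date, and the text conversion happens only in the outer select list.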
I have no idea what the dynamic SQL in your question is supposed to do, but if you need to use that query for whatever reasons as dynamic SQL, you need to escape the single quotes by doubling them.
execute '
select to_char(ap_month, ''Mon-YY'') as appl_month,
appl_count,
fa_count
from (
select date_trunc(''month'', application_date) as ap_month,
count(distinct application_id) as appl_count,
sum(final_approval_ind) as fa_count
from my_schema.my_table
group by ap_month
) t
order by ap_month;
'; -- end of dynamic SQL
But using Postgres' dollar quoting would be easier:
execute $dyn$
select to_char(ap_month, 'Mon-YY') as appl_month,
appl_count,
fa_count
from (
select date_trunc('month', application_date) as ap_month,
count(distinct application_id) as appl_count,
sum(final_approval_ind) as fa_count
from my_schema.my_table
group by ap_month
) t
order by ap_month;
$dyn$; -- end of dynamic SQL
Note that you can nest dollar quoted strings, so if that query is used inside a function, just use a different delimiter than you use for the function body (see the example in the manual)
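The quote-doubling rule is mechanical, so it can be sketched in a few lines of Python. The helper name as_sql_literal is made up for this illustration, and SQLite is used only to confirm that a SQL parser reads the doubled quotes back as single ones; it is not a replacement for quote_literal() or format() inside PostgreSQL.

```python
import sqlite3

def as_sql_literal(text: str) -> str:
    """Wrap text in single quotes, doubling any single quote inside it."""
    return "'" + text.replace("'", "''") + "'"

inner = "select to_char(application_date, 'Mon-YY') from my_schema.my_table"
literal = as_sql_literal(inner)
print(literal)

# Round trip through a SQL parser (SQLite here, which follows the same
# quoting rule): the literal evaluates back to the original text.
round_trip = sqlite3.connect(":memory:").execute("SELECT " + literal).fetchone()[0]
print(round_trip == inner)
```

The statement in the question fails at this step: the quote before Mon-YY ends the string early, so the parser never sees the rest as part of the literal.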

PostgreSQL GROUP BY and aggregate function error

My problem is that the following query runs without errors in MySQL.
Query:
SELECT
CONCAT(b.tarih, '#', CONCAT(b.enlem, ',', b.boylam), '#', b.aldigi_yol) AS IlkMesaiEnlemBoylamImei,
CONCAT(tson.max_tarih, '#', CONCAT(tson.max_enlem, ',', tson.max_boylam), '#', tson.max_aldigi_yol) AS SonMesaiEnlemBoylamImei,
Max(CAST(b.hiz AS UNSIGNED)) As EnYuksekHiz,
TIME_FORMAT(Sec_TO_TIME(TIMESTAMPDIFF(SECOND, (b.tarih), (tson.max_tarih))), '%H:%i') AS DurmaSuresi
FROM
(Select id as max_id, tarih as max_tarih, enlem as max_enlem, boylam as max_boylam, aldigi_yol as max_aldigi_yol from _213gl2015016424 where id in(
SELECT MAX(id)
FROM _213gl2015016424 where (tarih between DATE('2016-11-30 05:45:00') AND Date('2017-01-13 14:19:06')) AND CAST(hiz AS UNSIGNED) > 0
GROUP BY DATE(tarih))
) tson
LEFT JOIN _213gl2015016424 a ON a.id = tson.max_id
LEFT JOIN _213gl2015016424 b ON DATE(b.tarih) = DATE(a.tarih)
WHERE b.tarih is not null And (b.tarih between DATE('2016-11-30 05:45:00') AND Date('2017-01-13 14:19:06')) AND b.hiz > 0
GROUP BY tson.max_tarih
The output is ordered by date.
When I try to run the query in PostgreSQL, I get a GROUP BY error.
Query:
SELECT
CONCAT(b.tarih, '#', CONCAT(b.enlem, ',', b.boylam), '#', b.toplamyol) AS IlkMesaiEnlemBoylamImei,
CONCAT(tson.max_tarih, '#', CONCAT(tson.max_enlem, ',', tson.max_boylam), '#', tson.max_toplamyol) AS SonMesaiEnlemBoylamImei,
Max(CAST(b.hiz AS OID)) As EnYuksekHiz,
to_char(to_timestamp((extract(epoch from (tson.max_tarih)) - extract(epoch from (b.tarih)))) - interval '2 hour','HH24:MI') AS DurmaSuresi
FROM
(Select id as max_id, tarih as max_tarih, enlem as max_enlem, boylam as max_boylam, toplamyol as max_toplamyol from _213GL2016008691 where id in(
SELECT MAX(id)
FROM _213GL2016008691 where (tarih between DATE('2018-02-01 03:31:54') AND DATE('2018-03-01 03:31:54')) AND CAST(hiz AS OID) > 0
GROUP BY DATE(tarih))
) tson
LEFT JOIN _213GL2016008691 a ON a.id = tson.max_id
LEFT JOIN _213GL2016008691 b ON DATE(b.tarih) = DATE(a.tarih)
WHERE b.tarih is not null And (b.tarih between DATE('2018-02-12 03:31:54') AND DATE('2018-02-13 03:31:54')) AND b.hiz > 0
GROUP BY tson.max_tarih
The GROUP BY error is: to use the aggregate function, you must add the column "b.tarih" to the GROUP BY list.
When I add it, I get the same error for another column. Any help is appreciated.
You are relying on a MySQL feature that is not standard SQL, and that MySQL itself lets you turn off (by enabling the ONLY_FULL_GROUP_BY SQL mode).
You are grouping by tson.max_tarih in your query. That means that all rows sharing the same value in that field are collapsed into a single result row.
If the other fields (enlem, boylam, etc.) have several different values within a group, which one should the query return? That is the question PostgreSQL is asking you.
MySQL simply returns an arbitrary value for those fields from the rows in the group. PostgreSQL requires you to specify it.
Two typical solutions are to group by the remaining fields as well (b.tarih, b.enlem, ...) or to pick a value explicitly with an aggregate such as MAX(b.tarih).
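As a small illustration of the second option, the sketch below groups hypothetical rows by day and states explicitly which value each non-grouped column should contribute. The table name t and column gun are invented for the example, and SQLite is used only because it runs from the Python standard library; the same query shape is what PostgreSQL accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (gun TEXT, tarih TEXT, hiz INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("2018-02-12", "2018-02-12 08:00:00", 40),
        ("2018-02-12", "2018-02-12 17:30:00", 95),
        ("2018-02-13", "2018-02-13 09:15:00", 60),
    ],
)

# Every selected column is either in GROUP BY or wrapped in an aggregate,
# so there is no ambiguity about which row's value is returned.
per_day = conn.execute(
    """
    SELECT gun, MIN(tarih) AS first_seen, MAX(tarih) AS last_seen, MAX(hiz) AS top_speed
    FROM t
    GROUP BY gun
    ORDER BY gun
    """
).fetchall()
print(per_day)
```

Here MIN and MAX answer the "which one?" question directly: the first and last timestamp of each day, and the highest speed.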

ERROR: missing FROM-clause entry for table "movies"

I am new to SQL and need to query a database to extract certain information before I can import it into another software I am familiar with to analyse the data. This query was sent to me by a friend who I don't have access to at the moment, and I cannot figure out why it gives me the following error:
ERROR: missing FROM-clause entry for table "movies"
LINE 8: FROM (SELECT movies.movieid
Here is the query:
SELECT innerselect.movieid
,innerselect.title
,innerselect.year
,innerselect.imdbid
,innerselect.budget[1] AS budget_currency
,TO_NUMBER(innerselect.budget[2], '999999999999990.00') AS budget_total
,innerselect.businesstext
FROM (SELECT movies.movieid
,movies.title
,movies.year
,movies.imdbid
,business.businesstext
,regexp_matches(business.businesstext, '^BT:[ ](USD)[ ](-?(?!0)(?:\d+|\d{1,3}(?:,\d{3})+))', 'g') AS budget -- creates a PostgreSQL Array which contains the content matched with the RegEx Groups FROM movies LEFT JOIN business ON movies.movieid=business.movieid WHERE movies.movieid > 2753500
) AS innerselect
Any help would be greatly appreciated.
The problem is that the FROM is on the same line as the comment, so everything after the -- (including the whole FROM clause) became part of the comment and was ignored.
SELECT innerselect.movieid
,innerselect.title
,innerselect.year
,innerselect.imdbid
,innerselect.budget[1] AS budget_currency
,TO_NUMBER(innerselect.budget[2], '999999999999990.00') AS budget_total
,innerselect.businesstext
FROM (SELECT movies.movieid
,movies.title
,movies.year
,movies.imdbid
,business.businesstext
,regexp_matches(business.businesstext, '^BT:[ ](USD)[ ](-?(?!0)(?:\d+|\d{1,3}(?:,\d{3})+))', 'g') AS budget -- creates a PostgreSQL Array which contains the content matched with the RegEx Groups
FROM movies LEFT JOIN business ON movies.movieid=business.movieid WHERE movies.movieid > 2753500
) AS innerselect
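To make the effect of the line comment concrete, here is a rough Python sketch of what the parser is left with in each case. The helper strip_line_comment is hypothetical and deliberately naive (it does not handle -- inside string literals), and the SQL fragments are shortened versions of the lines in the question.

```python
def strip_line_comment(line: str) -> str:
    """Naively drop everything from the first '--' to the end of the line."""
    return line.split("--", 1)[0].rstrip()

# Broken: comment and FROM clause share one line.
broken = ", x AS budget -- creates an array FROM movies WHERE movies.movieid > 1"

# Fixed: the FROM clause starts on its own line.
fixed = [
    ", x AS budget -- creates an array",
    "FROM movies WHERE movies.movieid > 1",
]

seen_broken = strip_line_comment(broken)
seen_fixed = " ".join(strip_line_comment(line) for line in fixed)

print(seen_broken)
print(seen_fixed)
```

In the broken form nothing after the comment marker survives, so the subquery has no FROM clause and the reference to movies cannot be resolved.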