Apache Phoenix LIMIT OFFSET Error - apache-phoenix

I need to know what the error means and how to debug it.
Here is what I did.
Query1:
SELECT * FROM us_population ORDER BY population DESC;
Result1:
NY New York 8143197
CA Los Angeles 3844829
IL Chicago 2842518
TX Houston 2016582
PA Philadelphia 1463281
AZ Phoenix 1461575
TX San Antonio 1256509
CA San Diego 1255540
TX Dallas 1213825
CA San Jose 912332
Query2:
SELECT * FROM us_population ORDER BY population DESC LIMIT 5;
Result2:
NY New York 8143197
CA Los Angeles 3844829
IL Chicago 2842518
TX Houston 2016582
PA Philadelphia 1463281
Query3:
SELECT * FROM us_population ORDER BY population DESC LIMIT 5 OFFSET 5;
Result3:
Error: Error -1 (00000) : Error while executing SQL "SELECT * FROM vhen_test_population ORDER BY population DESC LIMIT 5 OFFSET 5": Remote driver error: RuntimeException: org.apache.phoenix.exception.PhoenixParserException: ERROR 602 (42P00): Syntax error. Missing "EOF" at line 1, column 69. -> PhoenixParserException: ERROR 602 (42P00): Syntax error. Missing "EOF" at line 1, column 69. -> MissingTokenException: (null exception message)
SQLState: 00000
ErrorCode: -1

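This is a parser error, not a data problem: column 69 of the statement falls on the OFFSET keyword, so your Phoenix version does not recognize OFFSET and expected the statement to end (EOF) after LIMIT 5.
OFFSET support for paged queries was added in Phoenix 4.8.0, so upgrade to 4.8.0 or later.
See https://phoenix.apache.org/paged.html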
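A minimal sketch of paging with LIMIT and OFFSET, written in Python against the built-in sqlite3 module. SQLite is only a local stand-in here (an assumption of this sketch, not Phoenix itself), and fetch_page is a hypothetical helper name. The SQL text uses the same LIMIT ... OFFSET form as Query3.

```python
import sqlite3

# Assumption: SQLite stands in for Phoenix so the example runs locally.
rows = [
    ("NY", "New York", 8143197), ("CA", "Los Angeles", 3844829),
    ("IL", "Chicago", 2842518), ("TX", "Houston", 2016582),
    ("PA", "Philadelphia", 1463281), ("AZ", "Phoenix", 1461575),
    ("TX", "San Antonio", 1256509), ("CA", "San Diego", 1255540),
    ("TX", "Dallas", 1213825), ("CA", "San Jose", 912332),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE us_population (state TEXT, city TEXT, population INTEGER)")
conn.executemany("INSERT INTO us_population VALUES (?, ?, ?)", rows)


def fetch_page(page, page_size=5):
    """Return one page of cities, largest first (page is 0-based; hypothetical helper)."""
    return conn.execute(
        "SELECT state, city, population FROM us_population "
        "ORDER BY population DESC LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()


second_page = fetch_page(1)  # same rows Query3 asks for: LIMIT 5 OFFSET 5
for row in second_page:
    print(row)
```

Keeping the ORDER BY fixed across pages matters: without a stable ordering, consecutive pages can overlap or skip rows.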

Related

How to search records while ignoring the special characters in postgres

I have a table in Postgres that has records like
 ID | Address
----+---------------------------------------------
  1 | 862 N Longbranch Road Voorhees, NJ 08043
  2 | 7300 Overlook, Ave Moncks Corner, SC 29461
  3 | 76 SW Green Lake, Street Sterling, VA 20164
  4 | 597 Wintergreen St Erlanger, KY 41018
So for searching a specific address my query is simple
select * from profile where address ilike '7300 Overlook, Ave Moncks Corner, SC%'
This is returning record 2
What I want is
select * from profile where address ilike '7300 Overlook Ave Moncks Corner SC%'
(Please note that the commas are missing in the second query.)
Even if the string inside ilike doesn't contain commas, record 2 should be returned.
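One possible approach, sketched under assumptions: strip the commas from both the column and the search string before comparing. The sketch below uses Python's built-in sqlite3 module as a stand-in (SQLite's LIKE is case-insensitive for ASCII, which roughly plays the role of ilike here). In Postgres the condition would presumably read replace(address, ',', '') ilike replace('...', ',', '') || '%'. The helper name search_ignoring_commas is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Postgres; replace() exists in both.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profile (id INTEGER PRIMARY KEY, address TEXT)")
conn.executemany(
    "INSERT INTO profile VALUES (?, ?)",
    [
        (1, "862 N Longbranch Road Voorhees, NJ 08043"),
        (2, "7300 Overlook, Ave Moncks Corner, SC 29461"),
        (3, "76 SW Green Lake, Street Sterling, VA 20164"),
        (4, "597 Wintergreen St Erlanger, KY 41018"),
    ],
)


def search_ignoring_commas(prefix):
    """Prefix search that ignores commas on both sides (hypothetical helper)."""
    sql = (
        "SELECT id, address FROM profile "
        "WHERE replace(address, ',', '') LIKE replace(?, ',', '') || '%'"
    )
    return conn.execute(sql, (prefix,)).fetchall()


matches = search_ignoring_commas("7300 Overlook Ave Moncks Corner SC")
print(matches)
```

Applying a function to the column on every row usually prevents a plain index on address from being used, so on a large table an expression index or a precomputed normalized column may be worth considering.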

Display MAX and 2nd MAX SALARY from the EMPLOYEE table

SELECT max(salary),
(SELECT MAX(SALARY) FROM EMPLOYEE
WHERE SALARY NOT IN(SELECT MAX(SALARY) FROM EMPLOYEE)) as 2ND_MAX_SALARY;
This is giving me the error: FROM keyword not found where expected
You want the top 2 rows of your table ordered by salary. The FETCH NEXT clause is available from Oracle 12c R1:
SELECT Salary FROM Employee ORDER BY Salary DESC
FETCH NEXT 2 ROWS ONLY;
Use
SELECT Salary FROM Employee ORDER BY Salary DESC
FETCH NEXT 2 ROWS WITH TIES;
if you want to return all employees that have the 1st or 2nd highest salary: There might only be one highest salary amount in the company, but more than one employee who gets that amount. Those rows are the ties.
If you're on an Oracle database version lower than 12c, the RANK analytic function might help.
For sample rows:
SQL> select * from employee order by salary desc;
ENAME SALARY
---------- ----------
KING 5000 --> highest salary
FORD 3000 --> Ford and Scott "share" the 2nd
SCOTT 3000 --> highest salary
JONES 2975
BLAKE 2850
CLARK 2450
ALLEN 1600
TURNER 1500
MILLER 1300
WARD 1250
MARTIN 1250
ADAMS 1100
JAMES 950
SMITH 800
14 rows selected.
In a subquery (or a CTE, as I did), calculate the rank of each salary and then, in the main query, select the rows that rank among the top two salaries:
SQL> with temp as
2 (select ename,
3 salary,
4 rank() over (order by salary desc) rnk
5 from employee
6 )
7 select ename, salary
8 from temp
9 where rnk <= 2
10 order by rnk desc;
ENAME SALARY
---------- ----------
SCOTT 3000
FORD 3000
KING 5000
SQL>
SELECT MAX(salary) AS max_salary,
(SELECT MAX(salary)
FROM employee
WHERE salary NOT IN (SELECT MAX(salary)
FROM employee
)
) AS second_max_salary -- an unquoted Oracle identifier must start with a letter
FROM employee;
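A small runnable check of the subquery approach, using the sample rows from the answer above. It is a sketch in Python with the built-in sqlite3 module standing in for Oracle (an assumption), and the alias second_max_salary is my own choice because an alias that starts with a digit needs quoting.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the query uses only portable SQL.
employees = [
    ("KING", 5000), ("FORD", 3000), ("SCOTT", 3000), ("JONES", 2975),
    ("BLAKE", 2850), ("CLARK", 2450), ("ALLEN", 1600), ("TURNER", 1500),
    ("MILLER", 1300), ("WARD", 1250), ("MARTIN", 1250), ("ADAMS", 1100),
    ("JAMES", 950), ("SMITH", 800),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ename TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", employees)

# Highest salary, plus the highest salary that is not the overall maximum.
top_two = conn.execute(
    "SELECT MAX(salary) AS max_salary, "
    "       (SELECT MAX(salary) FROM employee "
    "         WHERE salary NOT IN (SELECT MAX(salary) FROM employee)"
    "       ) AS second_max_salary "
    "FROM employee"
).fetchone()

# Everyone earning one of those two amounts, ties included.
earners = conn.execute(
    "SELECT ename FROM employee WHERE salary IN (?, ?) "
    "ORDER BY salary DESC, ename",
    top_two,
).fetchall()

print(top_two, [name for (name,) in earners])
```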

Matching two date columns in a postgres query with multiple joined tables

I am getting all of the information that I need from the following query:
SELECT
o.id,
o.customer_context,
o.organization_name,
o.shipping_name,
o.shipping_street1,
o.shipping_city,
o.shipping_state,
o.shipping_postal_code,
order_total,
shipping_charge,
sales_tax_charge,
discount_amount,
charge_date,
ship_date,
o.email,
shipping_country,
c.status,
c.unsubscribe,
c.last_logon,
c.last_action,
c.full_name AS customer_name,
c.email AS customer_email,
c.billing_email AS customer_billing_email,
c.organization_name AS customer_org_name,
c.phone AS customer_phone,
li.valid_from_dt,
li.valid_thru_dt,
pr.name,
sum(wi.order_qty) AS printed_book_count
FROM
online_order_onlineorder AS o
LEFT OUTER JOIN online_order_weborderitem AS wi ON (wi.web_order_id = o.id
AND format = 'PRT')
LEFT OUTER JOIN customer_customer AS c ON (c.id = o.customer_id)
LEFT OUTER JOIN customer_customer_curriculum_license AS li ON (li.customer_id = c.id)
LEFT OUTER JOIN product_curriculumlicense AS pli ON (pli.product_ptr_id = li.license_id)
LEFT OUTER JOIN product_product AS pr ON (pr.id = pli.product_ptr_id)
WHERE
o.status in('F', 'FF')
AND o.charge_date >= '2019-03-05'
AND o.charge_date < '2020-10-05'
GROUP BY
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28
ORDER BY
charge_date,
shipping_name
The only problem is that I cannot get o.charge_date to match li.valid_from_dt.
I have tried adding comparison operators such as:
WHERE
o.status in('F', 'FF')
AND o.charge_date >= '2019-03-05'
AND o.charge_date < '2020-10-05'
AND li.valid_from_dt >= '2019-03-05'
AND li.valid_from_dt < '2020-10-05'
but, as expected, that only limits the pool of li.valid_from_dt values and still doesn't match them up with o.charge_date. I also need to account for the fact that certain orders will have NULL for li.valid_from_dt.
The only relation between o.charge_date and li.valid_from_dt is that both tables are related to c.id. Is there some way to bring these tables together to match the two dates and keep all of the other data the same?
I have spent a while working on this and any help is greatly appreciated.
EDIT: Additional info, here is an example of the customer_customer_curriculum_license table from the same customer.
 id   | valid_from_dt | valid_thru_dt | created_on                 | updated_on                 | purchase_price | max_head_count | customer_id | license_id
------+---------------+---------------+----------------------------+----------------------------+----------------+----------------+-------------+------------
 2262 | 2014-06-03    | 2015-06-03    | 2015-06-24 18:35:36.884+00 | 2015-07-01 21:43:55.125+00 | 440.00         | 29             | 4178        | 1
 2263 | 2014-06-03    | 2015-06-03    | 2015-06-24 18:35:36.888+00 | 2015-07-01 21:43:55.128+00 | 440.00         | 19             | 4178        | 17
 2264 | 2014-06-03    | 2015-07-13    | 2015-06-24 18:35:36.891+00 | 2015-06-29 21:55:30.095+00 | 440.00         | 29             | 4178        | 13
 2265 | 2014-06-03    | 2015-07-13    | 2015-06-24 18:35:36.894+00 | 2015-06-29 21:54:16.496+00 | 440.00         | 19             | 4178        | 20
 2266 | 2014-06-03    | 2015-07-13    | 2015-06-24 18:35:36.897+00 | 2015-07-01 21:43:55.126+00 | 440.00         | 29             | 4178        | 14
 2267 | 2014-06-03    | 2015-07-13    | 2015-06-24 18:35:36.901+00 | 2015-06-29 21:41:29.784+00 | 440.00         | 29             | 4178        | 16
And an example of the online_order_onlineorder table.
Columns: id, status, customer_context, email, phone, billing_name, billing_street1, billing_street2, billing_state, billing_city, billing_postal_code, shipping_name, shipping_street1, shipping_street2, shipping_state, shipping_city, shipping_postal_code, order_total, shipping_charge, sales_tax_rate, sales_tax_charge, discount_amount, authorization_code, reference_number, transaction_id, created_on, updated_on, ship_date, charge_date, is_shipped, tracking_number, applied_offer_id, customer_id, billing_country, shipping_country, shipping_option_id, shipping_weight, customer_name, organization_contact, organization_name, gift_message, is_gift
Rows (non-empty values only, in column order):
418, FF, O, example#example.com, 0.00, 0.00, 0.00000, 0.00, 0.00, 2012-06-28 05:00:00+00, 2015-06-24 18:40:55.194+00, f, 4177, US, 0.00, f
420, FF, O, example#example.com, 0.00, 0.00, 0.00000, 0.00, 0.00, 2012-07-05 05:00:00+00, 2015-06-24 18:40:55.214+00, f, 4177, US, 0.00, f
The only related field between the two tables is customer_id. For each order, I need to match o.charge_date with the li.valid_from_dt of the same customer_id, so that I get the date they purchased and the license that started that same day.
My results: as you can see, the customer ordered on 2019-03-05 16:02:24.10583+00, but the valid_from_dt is incorrect; it should be the same date the customer ordered.
Columns: id, customer_context, organization_name, shipping_name, shipping_street1, shipping_city, shipping_state, shipping_postal_code, order_total, shipping_charge, sales_tax_charge, discount_amount, charge_date, ship_date, email, shipping_country, status, unsubscribe, last_logon, last_action, customer_name, customer_email, customer_billing_email, customer_org_name, customer_phone, valid_from_dt, valid_thru_dt, name, printed_book_count
Row (non-empty values only, in column order):
33733, O, None, John Doe, 111 test, Test, HI, 99999, 180.00, 0.00, 0.00, 0.00, 2019-03-05 16:02:24.10583+00, example#example.com, United States (Domestic And Apo/Fpo/Dpo Mail), O, f, 2020-08-19 13:49:26.338082+00, None, John Doe, example#example.com, 2017-04-25, 2018-04-24, Valid for 365 Days
Thanks to the much appreciated help from Richard Huxton, I settled on the following SQL query.
EDITED: the previous query had an issue; the following is the correct one. I cast both o.charge_date and li.valid_from_dt with ::date, and was then able to match on the exact data that I needed!
SELECT DISTINCT ON (o.id)
o.id,
o.customer_context,
o.organization_name,
o.shipping_name,
o.shipping_street1,
o.shipping_city,
o.shipping_state,
o.shipping_postal_code,
order_total,
shipping_charge,
sales_tax_charge,
discount_amount,
charge_date,
ship_date,
o.email,
shipping_country,
c.status,
c.unsubscribe,
c.last_logon,
c.last_action,
c.full_name AS customer_name,
c.email AS customer_email,
c.billing_email AS customer_billing_email,
c.organization_name AS customer_org_name,
c.phone AS customer_phone,
li.valid_from_dt,
li.valid_thru_dt,
pr.name,
sum(wi.order_qty) AS printed_book_count
FROM
online_order_onlineorder AS o
LEFT OUTER JOIN online_order_weborderitem AS wi ON (wi.web_order_id = o.id
AND format = 'PRT')
LEFT OUTER JOIN customer_customer AS c ON (c.id = o.customer_id)
LEFT OUTER JOIN customer_customer_curriculum_license AS li ON (li.valid_from_dt::date = o.charge_date::date)
LEFT OUTER JOIN product_curriculumlicense AS pli ON (pli.product_ptr_id = li.license_id)
LEFT OUTER JOIN product_product AS pr ON (pr.id = pli.product_ptr_id)
WHERE
o.status in('F', 'FF')
AND o.charge_date >= '2019-03-05'
AND o.charge_date < '2020-10-05'
GROUP BY
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28
ORDER BY
id,
charge_date,
shipping_name
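A reduced sketch of the date-matching join, in Python with the built-in sqlite3 module as a stand-in for Postgres (an assumption). SQLite's date(x) plays the role of x::date. The tables orders and licenses and all of their rows are hypothetical, cut down to the columns that matter. Unlike the query above, this sketch also keeps the customer_id condition in the join, on the assumption (taken from the question) that a license should only be matched to orders of the same customer.

```python
import sqlite3

# Assumption: SQLite stands in for Postgres; all table and row data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, charge_date TEXT);
    CREATE TABLE licenses (id INTEGER, customer_id INTEGER, valid_from_dt TEXT);
    INSERT INTO orders VALUES
        (1, 10, '2019-03-05 16:02:24'),
        (2, 10, '2019-06-01 09:00:00'),
        (3, 11, '2019-03-05 08:00:00');
    INSERT INTO licenses VALUES
        (100, 10, '2017-04-25'),
        (101, 10, '2019-03-05'),
        (102, 12, '2019-03-05');
    """
)

# Truncate the order timestamp to a date and match it per customer.
# The LEFT JOIN keeps orders without a license (license_id is NULL).
pairs = conn.execute(
    "SELECT o.id, li.id AS license_id "
    "FROM orders AS o "
    "LEFT JOIN licenses AS li "
    "  ON li.customer_id = o.customer_id "
    " AND date(li.valid_from_dt) = date(o.charge_date) "
    "ORDER BY o.id"
).fetchall()

print(pairs)
```

Order 3 shares its date with license 102, but that license belongs to a different customer, so it stays unmatched. Joining on the date alone would pair it with licenses of other customers.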

split_part function from nth split till end of string

After the second space, I need to fetch the values till the particular position in the string.
Source:
"8 115 MACKIE STREET VICTORIA PARK WA 6100 AU"
"6A CAMBOON ROAD MORLEY WA 6062 AU"
output:
"MACKIE STREET VICTORIA PARK"
"CAMBOON ROAD MORLEY"
I'm trying to split the street name and suburb from the unit #,street# present in the beginning and the state, postcode, country present in the end.
t=# with s(v) as (values('6A CAMBOON ROAD MORLEY WA 6062 AU'),('8 115 MACKIE STREET VICTORIA PARK WA 6100 A'))
, split as (select *,count(1) over (partition by v) from s, regexp_matches(v,'( [A-Z]+)','g') with ordinality t(m,o))
select distinct v,string_agg(m[1],'') over (partition by v) from split where o <= count-(3-1);
v | string_agg
---------------------------------------------+------------------------------
8 115 MACKIE STREET VICTORIA PARK WA 6100 A | MACKIE STREET VICTORIA PARK
6A CAMBOON ROAD MORLEY WA 6062 AU | CAMBOON ROAD MORLEY
(2 rows)
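The pattern ( [A-Z]+) only matches a space followed by capital letters, so the numeric postcode (like the leading unit and street numbers) never shows up among the matches. That is why I cut two matches from the end (state and country) rather than three: count-(3-1), where the 1 stands for the postcode that is already excluded.
Also, I do not start from the second space, because that would contradict your desired output: for the second row it would drop CAMBOON.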
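The same rule (skip the leading unit and street numbers, drop the trailing state, postcode and country) can be sketched in plain Python. This is an illustration of the logic only, not a replacement for the SQL above; the function name street_and_suburb and the "token contains a digit" test for the leading numbers are my own assumptions.

```python
def street_and_suburb(address, trailing_fields=3):
    """Return the street name and suburb from a flat address string.

    Assumptions: leading unit/street-number tokens contain at least one digit,
    and the last `trailing_fields` tokens are state, postcode and country.
    """
    tokens = address.split()
    start = 0
    # Skip leading tokens such as "8", "115" or "6A".
    while start < len(tokens) and any(ch.isdigit() for ch in tokens[start]):
        start += 1
    return " ".join(tokens[start:len(tokens) - trailing_fields])


first = street_and_suburb("8 115 MACKIE STREET VICTORIA PARK WA 6100 AU")
second = street_and_suburb("6A CAMBOON ROAD MORLEY WA 6062 AU")
print(first)
print(second)
```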

Preserve the order of distinct inside string_agg

My SQL function:
with recursive locpais as (
select l.id, l.nome, l.tipo tid, lp.pai
from loc l
left join locpai lp on lp.loc = l.id
where l.id = 12554
union
select l.id, l.nome, l.tipo tid, lp.pai
from loc l
left join locpai lp on lp.loc = l.id
join locpais p on (l.id = p.pai)
)
select * from locpais
gives me
12554 | PARNA Pico da Neblina | 9 | 1564
12554 | PARNA Pico da Neblina | 9 | 1547
1547 | São Gabriel da Cachoeira | 8 | 1400
1564 | Santa Isabel do Rio Negro | 8 | 1400
1400 | RIO NEGRO | 7 | 908
908 | NORTE AMAZONENSE | 6 | 234
234 | Amazonas | 5 | 229
229 | Norte | 4 | 30
30 | Brasil | 3 |
which is a hierarchy of places. "PARNA" stands for "National Park", and this one covers two cities: São Gabriel da Cachoeira and Santa Isabel do Rio Negro. Thus it's appearing twice.
If I change the last line to
select string_agg(nome,', ') from locpais
I get
"PARNA Pico da Neblina, PARNA Pico da Neblina, São Gabriel da
Cachoeira, Santa Isabel do Rio Negro, RIO NEGRO, NORTE AMAZONENSE,
Amazonas, Norte, Brasil"
Which is almost fine, except for the double "PARNA Pico da Neblina". So I tried:
select string_agg(distinct nome, ', ') from locpais
but now I get
"Amazonas, Brasil, Norte, NORTE AMAZONENSE, PARNA Pico da Neblina, RIO
NEGRO, Santa Isabel do Rio Negro, São Gabriel da Cachoeira"
Which is out of order. I tried to add an ORDER BY inside the string_agg, but couldn't make it work. The table definitions were given here.
As you've found out, you cannot combine DISTINCT and ORDER BY if you don't order by the distinct expression first:
neither in aggregates:
If DISTINCT is specified in addition to an order_by_clause, then all the ORDER BY expressions must match regular arguments of the aggregate; that is, you cannot sort on an expression that is not included in the DISTINCT list.
nor in SELECT:
The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).
However, you could use something like
array_to_string(arry_uniq_stable(array_agg(nome ORDER BY tid DESC)), ', ')
with the help of a function arry_uniq_stable that removes duplicates from an array without altering its order, like the one I gave as an example in https://stackoverflow.com/a/42399297/5805552
Please take care to use an ORDER BY expression that actually gives you a deterministic result. With the example you have given, tid alone would not be enough, as there are duplicate values (8) with different nome.
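select string_agg(nome, ', ' order by tid desc, nome)
from (
-- one row per nome; SELECT DISTINCT cannot be ordered by a column outside its select list
select nome, max(tid) as tid
from locpais
group by nome
) s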
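The idea behind both answers (sort first, then drop repeats without disturbing the order) can be sketched in plain Python. This only illustrates the logic; unique_stable is a hypothetical helper, not the arry_uniq_stable function from the linked answer, and nome is added as a tie-breaker so that the two rows with tid 8 come out in a fixed order.

```python
def unique_stable(items):
    """Drop duplicates while keeping the first occurrence of each item in place."""
    # dict preserves insertion order, so fromkeys() keeps the first position of each key.
    return list(dict.fromkeys(items))


# (nome, tid) pairs as returned by the recursive query in the question.
rows = [
    ("PARNA Pico da Neblina", 9),
    ("PARNA Pico da Neblina", 9),
    ("São Gabriel da Cachoeira", 8),
    ("Santa Isabel do Rio Negro", 8),
    ("RIO NEGRO", 7),
    ("NORTE AMAZONENSE", 6),
    ("Amazonas", 5),
    ("Norte", 4),
    ("Brasil", 3),
]

# Sort by tid descending, then by nome to make ties deterministic.
ordered = sorted(rows, key=lambda row: (-row[1], row[0]))
result = ", ".join(unique_stable(nome for nome, _tid in ordered))
print(result)
```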