How to write a hive query to display all records if one particular record is present? - hiveql

I have to write a Hive query that displays the full table if a particular record is present in it.
For example:
I have a table customers
(cust_name, cust_num, city, rating)
I have to display all the records if the city San Jose is present in this table.
How can I write this query using a SELECT statement?

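You could do this. The subquery is limited to one row so that rows are not repeated when several customers are in San Jose:
select a.* from customers a
cross join (select 1 as found from customers where city = 'San Jose' limit 1) b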

Try this query (note that it returns only the San Jose rows, not the whole table):
SELECT * FROM customers WHERE city = 'San Jose'
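
A minimal sketch of the "all rows if the city is present" idea, run on SQLite through Python's sqlite3 module rather than on Hive. The sample rows and the helper name all_rows_if_city_present are invented for illustration. The right side of the join is limited to a single row so that rows are not repeated when the city occurs more than once.

```python
import sqlite3

def all_rows_if_city_present(conn, city):
    """Return every customer row if `city` occurs in the table, else no rows."""
    query = """
        SELECT a.*
        FROM customers a
        CROSS JOIN (SELECT 1 AS found FROM customers WHERE city = ? LIMIT 1) b
        ORDER BY a.cust_num
    """
    return conn.execute(query, (city,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (cust_name TEXT, cust_num INTEGER, city TEXT, rating INTEGER)"
)
# Hypothetical sample data; San Jose appears twice on purpose.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        ("Hoffman", 2001, "London", 100),
        ("Liu", 2004, "San Jose", 200),
        ("Cisneros", 2008, "San Jose", 300),
    ],
)

present = all_rows_if_city_present(conn, "San Jose")  # whole table, once
absent = all_rows_if_city_present(conn, "Paris")      # empty result
print(len(present), len(absent))
```

The same query shape should carry over to Hive, but that is an assumption that has not been checked against a particular Hive version.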

Postgres SQL query group by get most recent record instead of an aggregate

This is my current Postgres query:
sql = """
SELECT
vms.campaign_id,
avg(vms.open_rate_uplift) as open_rate_average,
avg(vms.click_rate_uplift) as click_rate_average,
avg(vms.conversion_rate_uplift) as conversion_rate_average,
avg(cms.incremental_opens),
avg(cms.incremental_clicks),
avg(cms.incremental_conversions)
FROM
experiments.variant_metric_snapshot vms
INNER JOIN experiments.campaign_metric_snapshot cms ON vms.campaign_id = cms.campaign_id
WHERE
vms.campaign_id IN %(campaign_ids)s
GROUP BY
vms.campaign_id
"""
With this query I get the average incremental_opens, incremental_clicks, and incremental_conversions per campaign group from the cms table. However, instead of the average, I want the most recent values for these three fields: for a given group, the values from the cms record with the greatest (i.e. most recent) event_id rather than an average over all records.
How can I do this? Thanks
It sounds like you want a lateral join.
FROM
experiments.variant_metric_snapshot vms
CROSS JOIN LATERAL (
    SELECT * FROM experiments.campaign_metric_snapshot c
    WHERE c.campaign_id = vms.campaign_id
    ORDER BY c.event_id DESC LIMIT 1
) cms
WHERE...
If you are after a quick and dirty solution, you can use the array_agg function with minimal changes to your query.
SELECT
vms.campaign_id,
avg(vms.open_rate_uplift) as open_rate_average,
avg(vms.click_rate_uplift) as click_rate_average,
avg(vms.conversion_rate_uplift) as conversion_rate_average,
(array_agg(cms.incremental_opens ORDER BY cms.event_id DESC))[1] AS incremental_opens,
..
FROM
experiments.variant_metric_snapshot vms
INNER JOIN experiments.campaign_metric_snapshot cms ON vms.campaign_id = cms.campaign_id
WHERE
vms.campaign_id IN %(campaign_ids)s
GROUP BY
vms.campaign_id;
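
Neither LATERAL nor array_agg exists in SQLite, so the sketch below illustrates the same "latest row per campaign" idea with a swapped-in technique: a correlated subquery on MAX(event_id), run through Python's sqlite3 module. The table names are shortened from the question, only one metric column is kept, and the rows are invented. The vms averages are computed in a subquery first so that the join to cms cannot distort them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE variant_metric_snapshot (campaign_id INTEGER, open_rate_uplift REAL);
    CREATE TABLE campaign_metric_snapshot (
        event_id INTEGER, campaign_id INTEGER, incremental_opens INTEGER
    );
    INSERT INTO variant_metric_snapshot VALUES (1, 0.25), (1, 0.75), (2, 0.5);
    INSERT INTO campaign_metric_snapshot VALUES (1, 1, 10), (2, 1, 25), (3, 2, 7), (4, 2, 9);
""")

query = """
    SELECT v.campaign_id, v.open_rate_average, cms.incremental_opens
    FROM (
        -- aggregate the variant metrics per campaign first
        SELECT campaign_id, AVG(open_rate_uplift) AS open_rate_average
        FROM variant_metric_snapshot
        GROUP BY campaign_id
    ) v
    JOIN campaign_metric_snapshot cms ON cms.campaign_id = v.campaign_id
    WHERE cms.event_id = (
        -- keep only the most recent snapshot of each campaign
        SELECT MAX(c2.event_id)
        FROM campaign_metric_snapshot c2
        WHERE c2.campaign_id = v.campaign_id
    )
    ORDER BY v.campaign_id
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

In PostgreSQL itself, the lateral join or array_agg forms shown in the answers may be preferable; this version is only meant to make the expected result easy to check.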

How to make a query with the data of another query on SQL?

I have a users table and an orders table. A user can have many orders. How do I get the last 1000 users and, for each of those users, their last 1000 orders?
Query - Get users
select distinct users.id, users.first_name, users.last_name
from users
limit 2;
Query - Get orders
select distinct orders.id, orders.user_id
from orders
limit 2;
First, don't query the two tables separately. If you fetch all users and all orders independently, the results are unordered and not consistent with each other. Instead, join the tables so that each user appears with their orders. As @Sharon noted, you also need a column that records when each order was made. I assume you already have such a column; I will call it date_ordered.
Your query would then be:
select
users.id,
users.first_name,
users.last_name,
orders.id
from
users
inner join orders on users.id = orders.user_id
order by
orders.date_ordered desc
limit 1000
Ordering by date_ordered descending returns the most recent orders together with their users. I also assume that orders.user_id has a foreign key constraint referencing users.id.
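
As a rough check of the join described above, here is a small sketch using Python's sqlite3 module with an in-memory database. The date_ordered column is the assumed column from the answer, the sample rows are invented, and latest_orders is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        date_ordered TEXT  -- assumed column; ISO dates sort correctly as text
    );
    INSERT INTO users VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');
    INSERT INTO orders VALUES
        (10, 1, '2024-01-05'),
        (11, 2, '2024-02-01'),
        (12, 1, '2024-03-15');
""")

def latest_orders(conn, limit):
    """Return the `limit` most recent orders together with their users."""
    query = """
        SELECT users.id, users.first_name, users.last_name, orders.id
        FROM users
        INNER JOIN orders ON users.id = orders.user_id
        ORDER BY orders.date_ordered DESC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()

rows = latest_orders(conn, 2)
print(rows)
```

Note that the limit applies to the joined rows as a whole, not per user; capping the number of orders for each user would need something more, such as a window function.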

Select from view and join from one table or another if no record available in first

I have a view and two tables. Tables one and two have the same columns, but table one has a small number of records, while table two holds old data and a huge number of records.
I have to join the view with these two tables to get the latest data from table one; if a record from the view is not available in table one, then I have to select the record from table two.
How can I achieve this with MySQL?
From some research online, I understand that we can't use a full join or a subquery in the FROM clause.
Just do a simple UNION of the results excluding the records in table2 that are already mentioned in table1:
SELECT * FROM table1
UNION
SELECT * FROM table2
WHERE NOT EXISTS (SELECT * FROM table1 WHERE table2.id = table1.id)
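
The UNION with NOT EXISTS can be tried on a tiny dataset. This sketch uses Python's sqlite3 module instead of MySQL, and the id and val columns and their values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, val TEXT);
    CREATE TABLE table2 (id INTEGER, val TEXT);
    INSERT INTO table1 VALUES (1, 'new-1');
    INSERT INTO table2 VALUES (1, 'old-1'), (2, 'old-2');
""")

query = """
    SELECT id, val FROM table1
    UNION
    SELECT id, val FROM table2
    WHERE NOT EXISTS (SELECT 1 FROM table1 WHERE table1.id = table2.id)
    ORDER BY id
"""
# id 1 comes from table1; id 2 exists only in table2, so it falls back to table2.
merged = conn.execute(query).fetchall()
print(merged)
```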
Something like this.
SELECT *
FROM view1 V
INNER JOIN (SELECT COALESCE(a.commoncol, b.commoncol) AS commoncol
FROM table1 A
FULL OUTER JOIN table2 B
ON A.commoncol = B.commoncol) C
ON v.viewcol = c.commoncol
If you are using MySQL, which has no FULL OUTER JOIN, you will need to simulate it, for example with a UNION of a LEFT JOIN and a RIGHT JOIN.
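
One way to get the same fallback without a full outer join is two LEFT JOINs from the view combined with COALESCE. The sketch below is an alternative to the query above, not the answer's own method. It runs on SQLite via Python's sqlite3 module, and the base table, the val column, and the sample rows are all invented. It assumes commoncol is unique within each table and that val is never NULL in table1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base (viewcol INTEGER);
    CREATE VIEW view1 AS SELECT viewcol FROM base;
    CREATE TABLE table1 (commoncol INTEGER, val TEXT);
    CREATE TABLE table2 (commoncol INTEGER, val TEXT);
    INSERT INTO base VALUES (1), (2), (3);
    INSERT INTO table1 VALUES (1, 'new-1');
    INSERT INTO table2 VALUES (1, 'old-1'), (2, 'old-2');
""")

query = """
    SELECT v.viewcol, COALESCE(a.val, b.val) AS val
    FROM view1 v
    LEFT JOIN table1 a ON a.commoncol = v.viewcol  -- preferred source
    LEFT JOIN table2 b ON b.commoncol = v.viewcol  -- fallback source
    ORDER BY v.viewcol
"""
resolved = conn.execute(query).fetchall()
print(resolved)
```

Rows of the view with no match in either table still appear, with a NULL value; add a WHERE clause if those should be dropped.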
Are you trying to update the view from the two tables, so that old records in the view are overwritten by the latest records from table1 and records missing from table1 are appended from table2? Or are you creating a view from the two tables?

Select multiple columns without code duplication while joining two tables #active record # rails 2.3

Let us consider two tables
table1 - name, id, publisher_name, exp_date
table2 - book_id, price, discount, last_date
I have to retrieve name, id, and publisher_name from table1 and price and last_date from table2.
I wrote this code with Active Record in Rails 2:
Table1.find(:all, :select => "table1s.name, table1s.publisher_name, table1s.id, table2s.last_date, table2s.price", :joins => "LEFT OUTER JOIN table2s ON table1s.id = table2s.book_id")
In this code, selecting multiple columns means writing the table name repeatedly.
I need simpler code that avoids this problem.
If the selected columns are not present in both tables, you don't need to write the table name as a prefix. You also don't need to put table2 in front of "book_id". You only need the prefixes if the column names are ambiguous.
Table1.find(:all, :select => "name, publisher_name, id, last_date, price", :joins => "LEFT OUTER JOIN table2s ON table1s.id = book_id")
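
The rule about ambiguous column names comes from SQL rather than from Active Record, so it can be illustrated directly against a database. The sketch below uses Python's sqlite3 module; the pluralized table names table1s and table2s follow the Rails convention assumed in the join clause, and the sample row values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1s (id INTEGER, name TEXT, publisher_name TEXT, exp_date TEXT);
    CREATE TABLE table2s (book_id INTEGER, price REAL, discount REAL, last_date TEXT);
    INSERT INTO table1s VALUES (1, 'SQL Basics', 'Acme', '2030-01-01');
    INSERT INTO table2s VALUES (1, 9.5, 0.0, '2025-06-30');
""")

join = "FROM table1s LEFT OUTER JOIN table2s ON table1s.id = book_id"

# Every selected column exists in only one of the tables, so no prefix is needed.
unprefixed = conn.execute(
    "SELECT name, publisher_name, id, last_date, price " + join
).fetchall()

# Joining a table to itself makes `id` ambiguous, and SQLite rejects the query.
try:
    conn.execute("SELECT id FROM table1s a JOIN table1s b ON a.id = b.id")
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

print(unprefixed)
print(ambiguous_error)
```

If table2s later gains its own id column, the unprefixed query would start failing in the same way, which is one argument for keeping prefixes on columns that are likely to collide.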

PostgreSQL: Select first row as column inside select

I have two tables, Customers and Orders. Customers has the columns id and name; Orders has the columns id, customer_id, and order_date.
Now I need one select that returns each customer's id, name, and last order_date.
I tried it like this:
select
Customers.id,
Customers.name,
(select Orders.order_date from Orders where Orders.customer_id = Customers.id order by order_date desc limit 1) as last_order_date
from
Customers
But it uses the wrong index and takes forever to execute.
What's the best way to write this select in PostgreSQL?
Thanks in advance.
If you are not restricting by customer_id, the query will end up having to scan the entire Orders table.
SELECT c.id
,c.name
,MAX(o.order_date) AS last_order_date
FROM Customers c
LEFT OUTER JOIN Orders o ON (o.customer_id = c.id)
GROUP BY c.id, c.name
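
To see what the grouped query returns, including for customers without any orders, here is a small sketch using Python's sqlite3 module rather than PostgreSQL. The sample rows are invented, and the index on (customer_id, order_date) is an assumption added for illustration, not something the question states exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Orders (id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
    -- hypothetical index; it lets the latest date per customer be found cheaply
    CREATE INDEX orders_customer_date ON Orders (customer_id, order_date);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Orders VALUES (1, 1, '2024-01-10'), (2, 1, '2024-04-02');
""")

query = """
    SELECT c.id, c.name, MAX(o.order_date) AS last_order_date
    FROM Customers c
    LEFT OUTER JOIN Orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
"""
# Bob has no orders, so the LEFT JOIN keeps him with a NULL last_order_date.
last_orders = conn.execute(query).fetchall()
print(last_orders)
```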