ClickHouse LEFT JOIN with a BETWEEN condition

I have two queries:
In the first one I have the datetime of a payment.
In the second one I have several sessions.
I need to "LEFT JOIN t2 ON t1.datetime BETWEEN t2.startDatetime AND t2.endDatetime" and keep only the one row that satisfies the BETWEEN condition.
I know this works in standard SQL, but I can't find any information about the same feature in ClickHouse.
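ClickHouse does not accept a BETWEEN condition directly in a JOIN ON clause, but its ASOF JOIN is built for this "closest preceding timestamp" case and keeps at most one matching row per left-hand row. A sketch, assuming hypothetical payments(user_id, datetime) and sessions(user_id, startDatetime, endDatetime) tables:

```sql
-- ASOF JOIN needs at least one equality condition plus exactly one
-- inequality; it picks, per left row, the single closest match.
SELECT p.user_id, p.datetime, s.startDatetime, s.endDatetime
FROM payments AS p
ASOF LEFT JOIN sessions AS s
    ON p.user_id = s.user_id
   AND p.datetime >= s.startDatetime;
```

The upper bound is not part of the ASOF condition, so if a payment can fall after a session has ended you still need to check p.datetime <= s.endDatetime afterwards (e.g. in an outer query) and treat rows that fail it as non-matches.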


What does ORDER BY within OVER() window function mean in postgresql?

I am trying to understand how the ORDER BY clause in the OVER() window function is different from ORDER BY clause in generic SQL.
I was solving the following problem: https://www.pgexercises.com/questions/aggregates/nummembers.html
Produce a monotonically increasing numbered list of members (including guests),
ordered by their date of joining. Remember that member IDs are not guaranteed to be sequential.
The following query is one of the accepted solutions:
SELECT COUNT(*) OVER(ORDER by joindate), firstname, surname FROM cd.members;
As per my understanding, since we are not supplying a PARTITION BY clause in the OVER() function, all the rows in cd.members table form one big partition (let's call it X). When the window function runs, it should order X by joindate, and then COUNT(*) on X would return the number of rows in X which is just the number of rows in cd.members.
But this understanding is incorrect. The 'Answers and Discussion' accompanying the aforementioned problem states:
Since we define an order for the window function, for any given row the window is: start of the dataset -> current row.
The PG documentation on window function states:
You can also control the order in which rows are processed by window functions using ORDER BY within OVER. (The window ORDER BY does not even have to match the order in which the rows are output.)
What I cannot comprehend is why the ORDER BY inside OVER() makes the window stop at the current row. Could you please elaborate on how this works?
Thank you for reading through.
I don't know what to add beyond what the docs (same page as you already linked to) already say:
By default, if ORDER BY is supplied then the frame consists of all
rows from the start of the partition up through the current row, plus
any following rows that are equal to the current row according to the
ORDER BY clause.
I don't know if this is required by the SQL standard, but it certainly seems reasonable: why specify an ORDER BY if you expect it to have no observable effect?
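To make the default concrete: with an ORDER BY and no explicit frame, the implicit frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and you can override it. A sketch against the cd.members table from the linked exercise:

```sql
-- Running count: frame is start of partition .. current row (plus peers)
SELECT COUNT(*) OVER (ORDER BY joindate) AS running_count,
       firstname, surname
FROM cd.members;

-- Equivalent, with the default frame spelled out
SELECT COUNT(*) OVER (ORDER BY joindate
                      RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_count,
       firstname, surname
FROM cd.members;

-- Count over the whole partition, keeping the ORDER BY, by overriding the frame
SELECT COUNT(*) OVER (ORDER BY joindate
                      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS total_count,
       firstname, surname
FROM cd.members;
```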

How to get the maximum value and have SQL display a single row

I am learning SQL and want to do the following:
I need to get the highest value across two different tables. The output displays all rows, but I need a single row with the maximum value.
P.S. LIMIT 1 does not work in SQL Server Management Studio.
SELECT Players.PlayersID, MAX(Participants.EventsID) AS Maximum FROM Players
LEFT JOIN Participants ON Players.PlayersID = Participants.PlayersID
GROUP BY Players.PlayersID
I clearly understand that this can be a dumb question for pros, however Google did not help. Thanks for understanding and your help.
Try using TOP:
SELECT TOP 1
pl.PlayersID,
MAX(pa.EventsID) AS Maximum
FROM Players pl
LEFT JOIN Participants pa
ON pl.PlayersID = pa.PlayersID
GROUP BY
pl.PlayersID
ORDER BY
MAX(pa.EventsID) DESC;
If you want to cater for the possibility of two players being tied for the same maximum, then use TOP 1 WITH TIES instead of just TOP 1.
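Spelled out, the tie-tolerant variant of the same query (same hypothetical Players/Participants schema from the question) would be:

```sql
SELECT TOP 1 WITH TIES
    pl.PlayersID,
    MAX(pa.EventsID) AS Maximum
FROM Players pl
LEFT JOIN Participants pa
    ON pl.PlayersID = pa.PlayersID
GROUP BY pl.PlayersID
ORDER BY MAX(pa.EventsID) DESC;  -- WITH TIES also returns every row tied on this value
```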

Tableau Left Join not returning all rows

I have two data sources I want to join on an ID column.
The IDs in the main (left) table are all non-null.
The main table is a WDC data source and the second is a table from a MySQL database, so I could not do the join outside of Tableau.
When retrieved alone, the main table shows all rows on an incremental refresh. However, when I make the (left) join, only a few rows come back.
I first thought it was a filter issue, so I disabled all filters, but the behavior is still the same. I even made a fresh workbook with only the data sources, and the behavior is the same.
When I make a blend, all the rows are retrieved, but I want to add a filter on the second data source, so blending is not a solution for me.
If someone could give me a hint it would save my day.
Thanks
What are the data types of the two join columns? Maybe they differ, which would explain why blending gives you the desired result while the join does not. Try a calculated join and cast both sides to a string (varchar).

Concatenate Multiple Returned Rows Into One Row (standard methods don't work for some reason)

I have a relatively noobish question. I say this because I feel I am just missing the obvious: I am doing what many have done and asked about before, but the usual methods are not working for me. Hopefully it's just something simple.
Below is part of a bigger query I am working on. I am simply trying to combine two rows that differ in only one column into one row, with that column's values separated by a delimiter. Easy enough with CONCAT or STRING_AGG, right? Well, it doesn't work for me and I don't know why.
SELECT array_to_string(array_agg(ls_number), ',') "ls_number",
--Also tried CONCAT(ls_number, ',') and string_agg(ls_number, ',')
--and they don't work
shipitem_shiphead_id,
shipitem_orderitem_id,
shiphead_number
FROM shipitem
LEFT JOIN invhist
ON (shipitem_invhist_id=invhist_id)
LEFT JOIN invdetail
ON (invhist_id=invdetail_invhist_id)
LEFT JOIN ls
ON (invdetail_ls_id=ls_id)
LEFT JOIN shiphead
ON (shiphead_id = shipitem_shiphead_id)
WHERE shiphead_number = '72211'
GROUP BY ls_number,
shiphead_number,
shipitem_shiphead_id,
shipitem_orderitem_id
The results when the above query is run:
(screenshot of the results window omitted)
You can see from the results that the lot numbers are split into two rows. I need them on one row, with the lot numbers separated by the delimiter ','. Can someone explain what I am missing here? Thanks a bunch in advance!
You have ls_number in your GROUP BY clause, meaning you get a separate row for every distinct value of it in the result. Remove it from the GROUP BY clause and you should be OK.
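Applying that suggestion, the corrected query (same schema as above) would look like this:

```sql
SELECT string_agg(ls_number, ',') AS ls_number,  -- array_to_string(array_agg(...)) also works
       shipitem_shiphead_id,
       shipitem_orderitem_id,
       shiphead_number
FROM shipitem
LEFT JOIN invhist   ON shipitem_invhist_id = invhist_id
LEFT JOIN invdetail ON invhist_id = invdetail_invhist_id
LEFT JOIN ls        ON invdetail_ls_id = ls_id
LEFT JOIN shiphead  ON shiphead_id = shipitem_shiphead_id
WHERE shiphead_number = '72211'
GROUP BY shiphead_number,          -- ls_number removed so its values aggregate
         shipitem_shiphead_id,
         shipitem_orderitem_id;
```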

Faster CROSS JOIN alternative - PostgreSQL

I am trying to CROSS JOIN two tables, customers and items, so I can then create a sales-by-customer-by-item report. I have 2,000 customers and 2,000 items.
SELECT customer_name FROM customers; --Takes 100ms
SELECT item_number FROM items; --Takes 50ms
SELECT customer_name, item_number FROM customers CROSS JOIN items; --Takes 200000ms
I know this is 4 million rows, but is it possible to get this to run any faster? I want to eventually join this with a sales table like this:
SELECT customer_name, item_number, sales_total FROM customers CROSS JOIN items LEFT JOIN sales ON (customers.customer_name = sales.customer_name AND items.item_number = sales.item_number);
The sales table will obviously not have all customers or all items, so the goal here is to have a report that shows all customers and all items along with what was sold and not sold.
I'm using PostgreSQL 8.4
To answer your question: No, you can't do a cross join faster than that - if you could then that would be how CROSS JOIN would be implemented.
But really you don't want a cross join. You probably want two separate queries, one which lists all customers, and another which lists all items and whether or not they were sold.
This really needs to be multiple reports. I can think of several off the top of my head that will yield more efficient packaging of information:
Report: count of all purchases by customer/item (obvious).
Report: list of all items not purchased, by customer.
Report: Summary of Report #2 (count of items) in order to prioritize which customers to focus on.
Report: list of all customers that have not bought a given item, by item.
Report: Summary of Report #4 (count of customers) in order to identify both the most popular and unpopular items for further action.
Report: List of all customers who purchased an item in the past, but did not purchase it in this reporting period. This report is only relevant when the sales table has a date and the customers are expected to be regular buyers (i.e. disposable widgets). It won't work as well for things like service contracts.
The point here is that one should not insist that the tool process every possible outcome at once and generate more data than anyone could possibly digest manually. One should engage the end users and consumers of the data as to what their needs are and tailor the output to meet those needs. It will make both sides' lives much easier in the long run.
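As a sketch, report #2 (items a given customer has not purchased) can be expressed as an anti-join, which avoids materializing the full 4-million-row cross product; this assumes the customers/items/sales schema from the question, and 'ACME Corp' is a hypothetical customer name:

```sql
SELECT c.customer_name, i.item_number
FROM customers c
CROSS JOIN items i
LEFT JOIN sales s
    ON s.customer_name = c.customer_name
   AND s.item_number   = i.item_number
WHERE s.customer_name IS NULL          -- keep only unsold combinations
  AND c.customer_name = 'ACME Corp';   -- one customer at a time keeps it to ~2,000 rows
```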
If you wish to see all items for a given client (even if the client has no items), I would rather try:
SELECT c.customer_name, i.item_number, s.sales_total
FROM customers c
LEFT JOIN sales s ON c.customer_name = s.customer_name
LEFT JOIN items i ON i.item_number = s.item_number
This should give you a list of all clients, and all items joined through sales.
Perhaps you want something like this?
select c.customer_name, i.item_number, count( s.customer_name ) as total_sales
from customers c full join sales s on s.customer_name = c.customer_name
full join items i on i.item_number = s.item_number
group by c.customer_name, i.item_number
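One wrinkle with the FULL JOIN version: when a row comes only from the sales side, c.customer_name is NULL, so it is common to coalesce the output columns. A sketch of that adjustment:

```sql
SELECT COALESCE(c.customer_name, s.customer_name) AS customer_name,
       i.item_number,
       COUNT(s.customer_name) AS total_sales   -- counts only matched sales rows
FROM customers c
FULL JOIN sales s ON s.customer_name = c.customer_name
FULL JOIN items i ON i.item_number = s.item_number
GROUP BY COALESCE(c.customer_name, s.customer_name), i.item_number;
```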