Is distinct function deterministic? T-sql - tsql

I have a table like the one below. For each distinct combination of UserID and ProductID, will SQL select the product bought from StoreID 1 or 2? Is it deterministic?
My code:
SELECT (DISTINCT CONCAT(UserID, ProductID)), Date, StoreID FROM X

This isn't valid syntax. You can have
select [column_list] from X
or you can have
select distinct [column_list] from X
The difference is that the first will return one row for every row in the table while the second will return one row for every unique combination of the column values in your column list.
Adding "distinct" to a statement will reliably produce the same results every time unless the underlying data changes, so in this sense, "distinct" is deterministic. However, it is not a function so the term "deterministic" doesn't really apply.
You may actually want a "group by" clause like the following (in which case you have to actually specify how you want the engine to pick values for columns not in your group):
select
    concat(UserID, ProductID)
    , min(Date)
    , max(StoreID)
from
    X
group by
    concat(UserID, ProductID)
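To see what the GROUP BY version returns, here is a minimal sketch run against SQLite through Python's sqlite3 module, which stands in for SQL Server so the example is self-contained. The sample rows are made up, and the query groups on the two columns directly rather than on their concatenation, which is an assumption on my part to avoid collisions such as 1 + 23 versus 12 + 3.

```python
import sqlite3

# Hypothetical sample data; column names follow the question's query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE X (UserID INTEGER, ProductID INTEGER, Date TEXT, StoreID INTEGER)"
)
conn.executemany(
    "INSERT INTO X VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2021-01-05", 1),
        (1, 10, "2021-01-02", 2),
        (2, 20, "2021-02-01", 1),
    ],
)

# One row per (UserID, ProductID); each aggregate states explicitly
# which value to keep for the columns outside the group.
rows = conn.execute(
    """
    SELECT UserID, ProductID, MIN(Date) AS FirstDate, MAX(StoreID) AS MaxStoreID
    FROM X
    GROUP BY UserID, ProductID
    ORDER BY UserID, ProductID
    """
).fetchall()

for row in rows:
    print(row)
```

Note that MIN(Date) and MAX(StoreID) are computed independently, so the two values in a result row need not come from the same source row.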

Related

Create SQL Column Counting Frequency of Value in other column

I have the first three columns in SQL. I want to create a 4th column called Count that counts the number of times each unique name appears in the Name column. I want my results to appear like the dataset below, so I don't want to do a COUNT and GROUP BY.
What is the best way to achieve this?
We can try the COUNT window function:
SELECT *,COUNT(*) OVER(PARTITION BY name ORDER BY year,month) count
FROM T
ORDER BY year,month
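Whether the window includes an ORDER BY changes what the count means. The sketch below, using SQLite through Python's sqlite3 as a stand-in engine with made-up rows, puts both variants side by side: without ORDER BY the count is the total per name, while with ORDER BY it becomes a running count within each name.

```python
import sqlite3

# Hypothetical rows for a table T(name, year, month).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (name TEXT, year INTEGER, month INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [("Ann", 2020, 1), ("Ann", 2020, 2), ("Bob", 2020, 1)],
)

rows = conn.execute(
    """
    SELECT name, year, month,
           COUNT(*) OVER (PARTITION BY name) AS total_per_name,
           COUNT(*) OVER (PARTITION BY name ORDER BY year, month) AS running_count
    FROM T
    ORDER BY name, year, month
    """
).fetchall()

for row in rows:
    print(row)
```

If the goal is the same total on every row for a given name, the variant without ORDER BY inside OVER is the one to use.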

Postgres query filter by non column in table

I have a challenge that consists of filtering a query not by a value stored in a table but by a value returned by a function.
Let's consider a table that contains all sales in the database:
id, description, category, price, col1 , ..... col n
I have a function that returns the sales similar to a given one (based on rules and business logic). This function queries all records in the sales table again and matches on some fields.
similar_sales(sale_id integer) -> returns an integer[]
Now I need to list all similar sales for each sale in the sales table.
select s.id, similar_sales (s.id)
from sales s
But similar_sales can return null, and I am only interested in sales that have at least one similar sale.
select id, similar
from (
select s.id, similar_sales (s.id) as similar
from sales s
) q
where #similar > 1 (Pseudocode)
limit x
I can't apply the limit in the subquery because I don't know which sales have similar ones.
I just want to run the function over a small set of rows rather than the entire table, to improve query performance (pagination strategy).
You can try this:
-- cardinality() is null for a null array, so those rows are filtered out as well
select s.id, sim.similar_ids
from sales s
cross join lateral similar_sales(s.id) as sim(similar_ids)
where cardinality(sim.similar_ids) > 0
limit x

How to limit to just one result per condition when looking through multiple OR/IN conditions in the WHERE clause (Postgresql)

For Example:
SELECT * FROM Customers
WHERE Country IN ('Germany', 'France', 'UK')
I want to LIMIT 1 for each of the countries in my IN clause so I only see a total of 3 rows: one customer per country (1 German, 1 French, 1 UK). Is there a simple way to do that?
Normally, a simple GROUP BY would suffice for this type of problem. However, as you have specified that you want to include ALL of the columns in the result, we can use the ROW_NUMBER() window function to provide a value to filter on.
As a general rule, it is important to specify the column to sort on (ORDER BY) for all windowing or paged queries to make the result repeatable.
As no schema has been supplied, I have used Name as the field to sort on for the window. Please update that (or the question) with any other field you would like; the PK is a good candidate if you have nothing else to go on.
SELECT * FROM
(
    SELECT *
        , ROW_NUMBER() OVER(PARTITION BY Country ORDER BY Name) AS _rn
    FROM Customers
    WHERE Country IN ('Germany', 'France', 'UK')
) AS ranked
WHERE _rn = 1
The PARTITION BY makes ROW_NUMBER restart at 1 for each distinct Country value, so in this case we only select the rows that get a row number (aliased as _rn) of 1.
The Country filter could have gone in the outer query instead if you really wanted, but ROW_NUMBER() can only be specified in the SELECT or ORDER BY clauses of a query, so to use it as a filter criterion we are forced to wrap the results in some way.
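The ROW_NUMBER() pattern above can be tried out with a small sketch. It runs on SQLite through Python's sqlite3 rather than on Postgres, and the Customers rows are invented for illustration.

```python
import sqlite3

# Hypothetical Customers table with only the two columns the example needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Name TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?)",
    [
        ("Anna", "Germany"),
        ("Bernd", "Germany"),
        ("Claire", "France"),
        ("Dave", "UK"),
        ("Eve", "UK"),
        ("Frank", "Spain"),
    ],
)

# Number the rows within each country, then keep only the first one.
rows = conn.execute(
    """
    SELECT Name, Country
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY Country ORDER BY Name) AS _rn
        FROM Customers
        WHERE Country IN ('Germany', 'France', 'UK')
    ) AS ranked
    WHERE _rn = 1
    ORDER BY Country
    """
).fetchall()

for row in rows:
    print(row)
```

Because the window is ordered by Name, the customer kept for each country is the alphabetically first one, which is what makes the result repeatable.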

Transpose/Pivot a table in Postgres

I have been trying for hours to transpose one table into another this way:
My idea is to take an expression (which can be a simple SELECT * FROM X INNER JOIN Y ...) and transpose it into a MATERIALIZED VIEW.
The problem is that the original table can have an arbitrary number of rows (hence columns in the transposed table). So I was not able to find a working solution, not even with colpivot.
Can this ever be done?
Use conditional aggregation:
select "user",
       max(value) filter (where property = 'Name') as name,
       max(value) filter (where property = 'Age') as age,
       max(value) filter (where property = 'Address') as address
from the_table
group by "user";
A fundamental restriction of SQL is that all columns of a query must be known to the database before it starts running that query.
There is no way you can have a "dynamic" number of columns (evaluated at runtime) in SQL.
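As a rough illustration of conditional aggregation with a fixed column list, the sketch below uses SQLite through Python's sqlite3 as a stand-in for Postgres. It writes the conditions as CASE expressions instead of FILTER, a swapped-in but equivalent form that also runs on engines without FILTER support. The table contents are made up.

```python
import sqlite3

# Hypothetical key/value rows: one row per (user, property).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE the_table ("user" INTEGER, property TEXT, value TEXT)')
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [
        (1, "Name", "Alice"),
        (1, "Age", "30"),
        (1, "Address", "1 Main St"),
        (2, "Name", "Bob"),
        (2, "Age", "25"),
    ],
)

# Each output column picks out one property; a missing property yields NULL.
rows = conn.execute(
    """
    SELECT "user",
           MAX(CASE WHEN property = 'Name' THEN value END) AS name,
           MAX(CASE WHEN property = 'Age' THEN value END) AS age,
           MAX(CASE WHEN property = 'Address' THEN value END) AS address
    FROM the_table
    GROUP BY "user"
    ORDER BY "user"
    """
).fetchall()

for row in rows:
    print(row)
```

The output columns are fixed in the query text, so a new property value in the data does not produce a new column until the query is edited.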
Another alternative is to aggregate everything into a JSON value:
select "user",
jsonb_object_agg(property, value) as properties
from the_table
group by "user";

In TSQL, How do I add a count column that counts the number of rows in my query?

This can be done a number of ways, which I will explain at the end. For now, I have been given a work assignment that includes the following (simplified):
"Create a record each week to track the current status that has the following: account numbers (unique within each report), a random number (provided), their status (Green, Orange, or Blue), and make sure the record also has a column which tells me how many records there are this week."
I do not need code to generate a random number.
Columns: Account, RanNum, Status, NumberOfRowsThisWeek
How do I handle adding a column that determines the number of rows in my query and produces that number, static, within each row of that column?
I may try to tweak the request and apply a rising number. How would I go about doing it in this case?
Edit: SQL Server 2014
You are not telling us which database you are using.
In SQL Server, the newer versions at least, you have window (analytic) functions available, and they are also available in most other popular RDBMSs.
You could do what you want in SQL Server by adding this to your select:
,count(*) over (partition by 1) as [NrOfRows]
An analytical function does the "standard" query, and then performs the windowing function on the result set.
The count above, counts the rows in the result set, partitioned by the constant 1, which is of course stable across all rows, so it gives the full rowcount.
Not every database may allow a constant to be used that way. If yours does not, this form may work better; I know it works in SQL Server:
,count(*) over (partition by (select 1 n)) as [NrOfRows]
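A small sketch of the same idea, run on SQLite through Python's sqlite3 as a stand-in for SQL Server. It uses an empty OVER () clause, which SQL Server also accepts, in place of the constant partition, and adds ROW_NUMBER() for the "rising number" variant mentioned in the question. The table name WeeklyStatus and its rows are hypothetical.

```python
import sqlite3

# Hypothetical weekly report rows; RanNum is supplied, not generated here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WeeklyStatus (Account INTEGER, RanNum INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO WeeklyStatus VALUES (?, ?, ?)",
    [(1001, 42, "Green"), (1002, 17, "Orange"), (1003, 88, "Blue")],
)

rows = conn.execute(
    """
    SELECT Account, RanNum, Status,
           COUNT(*) OVER () AS NumberOfRowsThisWeek,      -- same total on every row
           ROW_NUMBER() OVER (ORDER BY Account) AS RowNum -- rising number
    FROM WeeklyStatus
    ORDER BY Account
    """
).fetchall()

for row in rows:
    print(row)
```

The window functions are evaluated over the rows that survive the WHERE clause, so adding a filter for the current week would make the count reflect only that week's rows.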
It sounds like you want to do some kind of simple count() / group by query:
select Account, RanNum, Status, count(*) as NumberOfRowsThisWeek
from tablename
group by Account, RanNum, Status
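You may need to count per Account and Status in a subquery and join it back to the table,
select t.Account, t.RanNum, t.Status, c.NumberOfRowsThisWeek
from tablename t
join (
    select Account, Status, count(*) as NumberOfRowsThisWeek
    from tablename
    group by Account, Status
) c on c.Account = t.Account and c.Status = t.Status
because the random number will confuse the group by, making every row unique.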