Postgres function to attach random integers to selected rows - postgresql

I want a function or trigger that, when a row is inserted, gives every row matching the same criteria a random integer between 1 and the number of matching rows, so that the rows come back in random order on a select.
E.g. if I have the data
Col1 Col2 Order
A 1
B 2
B 2
B 3
A 2
and I insert another row with Col1=B and Col2=2 then I want to end up with
Col1 Col2 Order
A 1
B 2 2
B 2 3
B 3
A 2
B 2 1
where Order is a number from 1 to the number of matching rows, with each number appearing only once.

There is no need to store this, you can generate such a number when you retrieve the data.
select col1,
col2,
row_number() over (partition by col1, col2 order by random()) as random_order
from the_table
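A quick way to convince yourself that this numbers each group 1..n without gaps or repeats is to run the same query against a throwaway database. The sketch below assumes SQLite through Python's sqlite3 module (with a SQLite build recent enough to support window functions) rather than Postgres; the sample rows are the ones from the question after the insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [("A", 1), ("B", 2), ("B", 2), ("B", 3), ("A", 2), ("B", 2)],
)

# Same query as in the answer: number the rows of each (col1, col2) group
# in a random order, computed at read time instead of being stored.
rows = conn.execute(
    """
    SELECT col1, col2,
           row_number() OVER (PARTITION BY col1, col2 ORDER BY random()) AS random_order
    FROM the_table
    """
).fetchall()

# Collect the generated numbers per group to inspect them.
groups = {}
for col1, col2, random_order in rows:
    groups.setdefault((col1, col2), []).append(random_order)

for key, numbers in sorted(groups.items()):
    print(key, sorted(numbers))
```

Which row of a group gets which number changes from run to run, but every group always receives each number from 1 to its row count exactly once.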

Related

SQL Where Clause with multiple and joined conditions

In the table below, I want to write a SQL query that excludes a row when col1=2 and col2=1. I want to drop the row only when both conditions are met; if col1=2 but col2<>1, I want to keep it.
col1 col2
1 2
1 2
2 2
2 1
I am trying the snippet below, but it's not working:
select *
from table
where (col1<>2 and col2<>1)
You can use either:
WHERE NOT (col1 = 2 AND col2 = 1)
or
WHERE (col1 <> 2 OR col2 <> 1)
The two are equivalent by De Morgan's law.
You should use OR instead of AND.
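To see the difference on the sample data, here is a small sketch that runs the original condition and both corrected forms. It assumes SQLite through Python's sqlite3 module, and the table name t and the helper select_where are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (1, 2), (2, 2), (2, 1)])

def select_where(condition):
    """Return the rows of t that satisfy the given WHERE condition."""
    sql = f"SELECT col1, col2 FROM t WHERE {condition} ORDER BY rowid"
    return conn.execute(sql).fetchall()

# The attempt from the question also drops (2, 2), because col1 <> 2 fails for it.
attempt = select_where("col1 <> 2 AND col2 <> 1")

# Negating the exact combination to exclude keeps (2, 2).
with_not = select_where("NOT (col1 = 2 AND col2 = 1)")

# De Morgan's law gives the same condition written with OR.
with_or = select_where("col1 <> 2 OR col2 <> 1")

print(attempt)   # [(1, 2), (1, 2)]
print(with_not)  # [(1, 2), (1, 2), (2, 2)]
print(with_or)   # [(1, 2), (1, 2), (2, 2)]
```

Note that if either column can be NULL, the comparisons evaluate to unknown and those rows are filtered out in all three variants.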

Spark use self reference in calculation for column

I have a data frame like this one given below. Essentially it is a time series derived data frame.
My issue is that the formula for Col C in the n-th row is:
Col C(n) = (Col A(n) - Col A(n-1)) + Col C(n-1)
So the calculation of Col C references its own previous value. I am using Spark SQL; can someone please advise how to proceed? For the calculation of Col A I am using the LAG function.
It seems colC is just colA minus colA in the first row.
e.g.
1 = 6-5,
0 = 5-5,
2 = 7-5,
3 = 8-5,
-2 = 3-5
So this query should work:
SELECT colA, colA - FIRST(colA) OVER (ORDER BY id) AS colC
FROM your_table -- placeholder: the question does not name the table
Your formula is a cumulative sum. Here is a complete example:
SELECT rowid, a, SUM(c0) OVER(ORDER BY rowid) as c
FROM
(
SELECT rowid, a, a - LAG(a, 1) OVER(ORDER BY rowid) as c0
FROM
(
SELECT 1 as rowid, 5 as a union all
SELECT 2 as rowid, 6 as a union all
SELECT 3 as rowid, 5 as a union all
SELECT 4 as rowid, 7 as a union all
SELECT 5 as rowid, 8 as a union all
SELECT 6 as rowid, 3 as a
)t
)t
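The two answers agree because the differences telescope: summing (a(n) - a(n-1)) from the second row onward leaves a(n) - a(1). Below is a sketch that checks this on the sample values above. It assumes SQLite through Python's sqlite3 module instead of Spark, so first_value stands in for Spark's FIRST, and the table name series is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE series (id INTEGER, a INTEGER)")
conn.executemany(
    "INSERT INTO series VALUES (?, ?)",
    [(1, 5), (2, 6), (3, 5), (4, 7), (5, 8), (6, 3)],
)

# Running sum of the row-to-row differences (the cumulative-sum answer).
cumulative = conn.execute(
    """
    SELECT id, SUM(c0) OVER (ORDER BY id) AS c
    FROM (SELECT id, a - LAG(a, 1) OVER (ORDER BY id) AS c0 FROM series)
    ORDER BY id
    """
).fetchall()

# Current value minus the first value (the shortcut answer).
from_first = conn.execute(
    """
    SELECT id, a - first_value(a) OVER (ORDER BY id) AS c
    FROM series
    ORDER BY id
    """
).fetchall()

print(cumulative)  # [(1, None), (2, 1), (3, 0), (4, 2), (5, 3), (6, -2)]
print(from_first)  # [(1, 0), (2, 1), (3, 0), (4, 2), (5, 3), (6, -2)]
```

The only difference is the first row: LAG has no previous row there, so the running sum is NULL, while the shortcut gives 0. Wrapping the difference in COALESCE(..., 0) would make the two match if 0 is the starting value you want.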

filter rows from Postgres table based on specific conditions without missing relevant rows

I have table with following columns in postgres.
col1 col2 col3
1 Other a
2 Drug b
1 Procedure c
3 Combination Drug d
4 Biological e
3 Behavioral f
3 Drug g
5 Drug g
6 Procedure h
I would like to filter rows based on following filters.
select col1, col2, col3
from tbl
where col2 in ('Other', 'Drug', 'Combination Drug', 'Biological')
But this query will exclude below rows
1 Procedure c
3 Behavioral f
The Desired output is:
col1 col2 col3
1 Other a
1 Procedure c
2 Drug b
3 Combination Drug d
3 Behavioral f
3 Drug g
4 Biological e
5 Drug g
How can I achieve the above output without missing the mentioned rows.
Any suggestion here will be really helpful. Thanks
I think you want every row whose col1 value appears in at least one row where col2 is in the list:
select col1, col2, col3
from tbl
where col1 in (
select col1 from tbl
where col2 in ('Other', 'Drug', 'Combination Drug', 'Biological')
)
order by col1;
Or with EXISTS:
select t.col1, t.col2, t.col3
from tbl t
where exists (
select 1 from tbl
where col1 = t.col1
and col2 in ('Other', 'Drug', 'Combination Drug', 'Biological')
)
order by col1;
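For anyone who wants to try the EXISTS version locally, here is a sketch with the sample data from the question. It assumes SQLite through Python's sqlite3 module rather than Postgres, and it adds col3 to the ORDER BY only so that the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, "Other", "a"), (2, "Drug", "b"), (1, "Procedure", "c"),
        (3, "Combination Drug", "d"), (4, "Biological", "e"),
        (3, "Behavioral", "f"), (3, "Drug", "g"), (5, "Drug", "g"),
        (6, "Procedure", "h"),
    ],
)

# Keep a row when any row sharing its col1 has a col2 value from the list.
rows = conn.execute(
    """
    SELECT t.col1, t.col2, t.col3
    FROM tbl t
    WHERE EXISTS (
        SELECT 1 FROM tbl
        WHERE col1 = t.col1
          AND col2 IN ('Other', 'Drug', 'Combination Drug', 'Biological')
    )
    ORDER BY t.col1, t.col3
    """
).fetchall()

for row in rows:
    print(row)
```

Rows "1 Procedure c" and "3 Behavioral f" are kept because another row with the same col1 matches the list, while "6 Procedure h" is dropped because no row with col1 = 6 does.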

how to join two tables without repetition of the cells from second table in postgresql using PLSQL

When I try to join the two tables below, I am not able to get the output I want.
I tried using a join, but it didn't work. Let me know if it's possible with PL/SQL.
Table 1:
col1 col2
1 a
1 b
1 c
2 a
2 b
3 a
table 2:
col1 col2
1 x
1 y
2 x
2 y
3 x
3 y
The output must be:
col1 col2 col3
1    a    x
1    b    y
1    c
2    a    x
2    b    y
3    a    x
3         y
If I use a join, I am not able to get the output above.
The output I am getting is
1 a x
1 a y
1 b x
1 b y
1 c x
1 c y
2 a x
.....
.....
3 a x
3 a y
What you are looking for is called a FULL OUTER JOIN. The result of this join contains rows from both input tables, and matching records are combined.
You can find more information here: https://stackoverflow.com/questions/4796872/full-outer-join-in-mysql
Using window functions, specifically ROW_NUMBER() partitioned by col1 in both tables, we can get a per-partition row number that can be used as part of the join.
In other words, it seems to me that the order of the records is crucial for the join and the result set you want. Combining that with @Benvorth's suggestion of a FULL OUTER JOIN to get the NULLs in both directions, I believe this might work:
SELECT
COALESCE(t1.col1,t2.col1) as col1,
t1.col2 as col2,
t2.col2 as col3
FROM
(SELECT col1, col2, ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2 ASC) as col1_row_number FROM table1) t1
FULL OUTER JOIN
(SELECT col1, col2, ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2 ASC) as col1_row_number FROM table2) t2 ON
t1.col1 = t2.col1 AND
t1.col1_row_number = t2.col1_row_number
The ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2 ASC) expression assigns a row number to each record, restarting at 1 for each new col1 value. You can think of it as a rank within each distinct col1 value, ordered by col2. The output of the table1 subquery SELECT col1, col2, ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2 ASC) as col1_row_number FROM table1 will look like:
Table 1:
col1 col2 col1_row_number
1 a 1
1 b 2
1 c 3
2 a 1
2 b 2
3 a 1
So we do that with both tables, then we use that row number as part of the join along with col1.
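Here is a runnable sketch of the same idea on the sample tables. It assumes SQLite through Python's sqlite3 module, and because older SQLite builds have no FULL OUTER JOIN it emulates one with two LEFT JOINs and UNION ALL; on Postgres the FULL OUTER JOIN query can be used directly. The short name rn stands in for col1_row_number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
conn.execute("CREATE TABLE table2 (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "b"), (3, "a")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, "x"), (1, "y"), (2, "x"), (2, "y"), (3, "x"), (3, "y")],
)

query = """
WITH t1 AS (
    SELECT col1, col2,
           ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS rn
    FROM table1
),
t2 AS (
    SELECT col1, col2,
           ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS rn
    FROM table2
)
-- Every table1 row, paired with the table2 row at the same position, if any.
SELECT t1.col1 AS col1, t1.rn AS rn, t1.col2 AS col2, t2.col2 AS col3
FROM t1 LEFT JOIN t2 ON t1.col1 = t2.col1 AND t1.rn = t2.rn
UNION ALL
-- table2 rows that found no partner in table1.
SELECT t2.col1, t2.rn, NULL, t2.col2
FROM t2 LEFT JOIN t1 ON t1.col1 = t2.col1 AND t1.rn = t2.rn
WHERE t1.col1 IS NULL
ORDER BY col1, rn
"""

# Drop the helper row number before showing the result.
rows = [(col1, col2, col3) for col1, rn, col2, col3 in conn.execute(query)]
for row in rows:
    print(row)
```

Each table1 row is matched with at most one table2 row, so the cross-product repetition from the plain join disappears, and the unmatched "1 c" and "3 y" rows come back with a NULL on the other side.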

Tsql query to find equal row values along columns

I have this table:
col 1 col 2 col 3 .... col N
-------------------------------------
1 A B fooa
10 A foo cc
4 A B fooa
Is it possible with a T-SQL query to return a single row that shows a value only for the columns where ALL the values are the same?
col 1 col 2 col 3 .... col N
-------------------------------------
-- A -- --
SELECT
CASE WHEN COUNT(col1) = COUNT(*) AND MIN(col1) = MAX(col1) THEN MIN(col1) END AS col1,
CASE WHEN COUNT(col2) = COUNT(*) AND MIN(col2) = MAX(col2) THEN MIN(col2) END AS col2,
...
FROM yourtable
You have to allow for NULLs in the column:
COUNT(*) counts them
COUNT(col1) doesn't count them
That is, a column with a mix of As and NULLs isn't one value. MIN and MAX alone would both give A because they ignore NULLs, which is why the COUNT comparison is needed.
Edit:
removed DISTINCT to get counts the same for NULL check
added MIN/MAX check (as per Mark Byers deleted answer) to check uniqueness
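A sketch of the check on data shaped like the question's table, to show why the COUNT comparison matters. It assumes SQLite through Python's sqlite3 module instead of SQL Server (the aggregates behave the same way for this purpose); col4 is a made-up column mixing A and NULL, and the helper uniform just builds the CASE expression for one column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (col1 INTEGER, col2 TEXT, col3 TEXT, col4 TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [(1, "A", "B", "A"), (10, "A", "foo", None), (4, "A", "B", "A")],
)

columns = ["col1", "col2", "col3", "col4"]

def uniform(col):
    """CASE expression that yields the column's value only if every row has it."""
    return (
        f"CASE WHEN COUNT({col}) = COUNT(*) AND MIN({col}) = MAX({col}) "
        f"THEN MIN({col}) END AS {col}"
    )

select_list = ", ".join(uniform(c) for c in columns)
checked = conn.execute(f"SELECT {select_list} FROM yourtable").fetchone()

# Without the COUNT comparison, the NULL in col4 goes unnoticed.
naive = conn.execute(
    "SELECT CASE WHEN MIN(col4) = MAX(col4) THEN MIN(col4) END FROM yourtable"
).fetchone()[0]

print(checked)  # (None, 'A', None, None)
print(naive)    # A
```

Only col2 survives: col1 and col3 have differing values, and col4 is rejected by the COUNT comparison even though MIN and MAX agree.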