find rows not following by the same values in 3 columns - postgresql

I have a table named raw_data with the following data
as You can see id 1 and 2 share the same values in field desa, kecamatan and kabupaten, also id 3,4,5.
So basically I want to select all rows that is not followed by the same previous values. expected result would be:
I know it's easy to do this in any programming languages such as PHP, but I need this in postgresql. is this doable? Thanks in Advance.

Assuming higher id denotes latest row, if a row with same all three columns is present not together and you don't want to filter it out as it doesn't have same values as previous row (order by id or created_date), then you can make use of analytic lag() function:
select *
from (
select
t.*,
case
when desa = lag(desa) over (order by id)
and kecamatan = lag(kecamatan) over (order by id)
and kabupaten = lag(kabupaten) over (order by id)
then 0 else 1
end flag
from your_table t
) t where flag = 1;

Related

PostgreSQL function results in inconsistent number of rows when used in WHERE clause

I have a function "experiments.get_random_experiment_group(experiment)" that when called in isolation behaves normally, but when called in a WHERE clause results in zero or more rows being returned.
For example, if I call this function like so:
SELECT
experiments.get_random_experiment_group('1a8cd547-8de6-4602-9103-3cea12f57365');
it will always return 1 of the 2 groups ids associated with that experiment.
But if I do something like this...:
SELECT
id,
experiment_id,
proportion
FROM
experiments."group"
WHERE
id = experiments.get_random_experiment_group('1a8cd547-8de6-4602-9103-3cea12f57365');
I will sometimes get 1 row, 2 rows, or no rows at all.
Why is it behaving as expected in the first case, but not in the second? Any suggestions as to what's happening?
Here is the function:
CREATE OR REPLACE FUNCTION experiments.get_random_experiment_group (experiment uuid)
RETURNS uuid
LANGUAGE plpgsql
AS
$function$
DECLARE
result uuid;
BEGIN
WITH
CTE
AS
(
SELECT
random() * (SELECT SUM(g.proportion) FROM experiments."group" g WHERE g.experiment_id = experiment) AS r
)
SELECT
id
INTO
result
FROM
(
SELECT
id,
SUM(proportion) OVER (ORDER BY id) AS s,
r
FROM
(
SELECT
*
FROM
experiments."group" g
WHERE
g.experiment_id = experiment
) a
CROSS JOIN
CTE
) q
WHERE
s >= r
ORDER BY
id
LIMIT
1;
return result;
end;
$function$;
The purpose of this function is to randomly select a group id for a given experiment id. The probability of selecting a particular group is given by the “proportion” assigned to that group.
And here are all of the groups associated with the experiment in my example:
id
experiment_id
proportion
95cc9e77-3dff-48d9-91cb-3e17c0c24d5a
1a8cd547-8de6-4602-9103-3cea12f57365
0.5
47469555-88fd-4cec-925f-9cf15b91bca0
1a8cd547-8de6-4602-9103-3cea12f57365
0.5
I am using postgres version 13.4
The function was getting called for each row in the "group" table, and thus returning rows when and only when the group id of that row happened to correspond to the randomly generated id returned by the function call.
Thanks to the commenters for pointing this out.

SQL query to put a number in a column and put an incremented number when there is a new text in a column

I have a query SELECT * from TABLE which gives the result as below table:
Expected column is as below:
I want to frame a new column like whenever we get the value as 0 then the number should be incremented by 1. I tried DENSE_RANK() , ROW_NUMBER() but couldn't get the exact result which mentioned. Is that possible in PostgreSQL.
Try This:
select name, value,
sum(case when value=0 then 1 else 0 end) over (order by "sno")
from (
select row_number() over() as "sno",* from example
) tab
DEMO
NOTE: Please note that there is no guaranteed that you will get same output always due no ordering field in your raw data.
So Better approach is to add some field in your view output by which it can be ordered and run the query like below:(assuming you have a ID field)
select
name,
value,
sum(case when value=0 then 1 else 0 end) over (order by id)
from example
DEMO

How to reference output rows with window functions?

Suppose I have a table with quantity column.
CREATE TABLE transfers (
user_id integer,
quantity integer,
created timestamp default now()
);
I'd like to iteratively go thru a partition using window functions, but access the output rows, not the input table rows.
To access the input table rows I could do something like this:
SELECT LAG(quantity, 1, 0)
OVER (PARTITION BY user_id ORDER BY created)
FROM transfers;
I need to access the previous output row to calculate the next output row. How can i access the lag row in the output? Something like:
CREATE VIEW balance AS
SELECT LAG(balance.total, 1, 0) + quantity AS total
OVER (PARTITION BY user_id ORDER BY created)
FROM transfers;
Edit
This is a minimal example to support the question of how to access the previous output row within a window partition. I don't actually want a sum.
It seems you attempt to calculate a running sum. Luckily that's just what Sum() window function does:
WITH transfers AS(
SELECT i, random()-0.3 AS quantity FROM generate_series(1,100) as i
)
SELECT i, quantity, sum(quantity) OVER (ORDER BY i) from transfers;
I guess, looking at the question, that the only you need is to calculate a cumulative sum.
To calculate a cumulative summ use this query:
SELECT *,
SUM( CASE WHEN quantity IS NULL THEN 0 ELSE quantity END)
OVER ( PARTITION BY user_id ORDER BY created
ROWS BETWEEN unbounded preceding AND current row
) As cumulative_sum
FROM transfers
ORDER BY user_id, created
;
But if you want more complex calculations, especially containing some conditions (decisions) that depend on a result from prevoius row, then you need a recursive approach.

query for a range of records in result

I am wondering if there is some easy way, a function, or other method to return data from a query with the following results.
I have a SQL Express DB 2008 R2, a table that contains numerical data in a given column, say col T.
I am given a value X in code and would like to return up to three records. The record where col T equals my value X, and the record before and after, and nothing else. The sort is done on col T. The record before may be beginning of file and therefore not exist, likewise, if X equals the last record then the record after would be non existent, end of file/table.
The value of X may not exist in the table.
This I think is similar to get a range of results in numerical order.
Any help or direction in solving this would be greatly appreciated.
Thanks again,
It might not be the most optimal solution, but:
SELECT T
FROM theTable
WHERE T = X
UNION ALL
SELECT *
FROM
(
SELECT TOP 1 T
FROM theTable
WHERE T > X
ORDER BY T
) blah
UNION ALL
SELECT *
FROM
(
SELECT TOP 1 T
FROM theTable
WHERE T < X
ORDER BY T DESC
) blah2
DECLARE #x int = 100
;WITH t as
(
select ROW_NUMBER() OVER (ORDER BY T ASC) AS row_nm,*
from YourTable
)
, t1 as
(
select *
from t
WHERE T = #x
)
select *
from t
CROSS APPLY t1
WHERE t.row_nm BETWEEN t1.row_nm -1 and t1.row_nm + 1

t-sql return multiple rows depending on field value

i am trying to run an export on a system that only allows t-sql. i know enough of php to make a foreach loop, but i don't know enough of t-sql to generate multiple rows for a given quantity.
i need a result to make a list of items with "1 of 4" like data included in the result
given a table like
orderid, product, quantity
1000,ball,3
1001,bike,4
1002,hat,2
how do i get a select query result like:
orderid, item_num, total_items,
product
1000,1,3,ball
1000,2,3,ball
1000,3,3,ball
1001,1,4,bike
1001,2,4,bike
1001,3,4,bike
1001,4,4,bike
1002,1,2,hat
1002,2,2,hat
You can do this with the aid of an auxiliary numbers table.
;WITH T(orderid, product, quantity) AS
(
select 1000,'ball',3 union all
select 1001,'bike',4 union all
select 1002,'hat',2
)
SELECT orderid, number as item_num, quantity as total_items, product
FROM T
JOIN master..spt_values on number> 0 and number <= quantity
where type='P'
NB: The code above uses the master..spt_values table - this is just for demo purposes I suggest you create your own tally table using one of the techniques here.
If you are on SQL Server 2005 or later version, then you can try a recursive CTE instead of a tally table.
;WITH CTE AS
(
SELECT orderid, 1 item_num, product, quantity
FROM YourTable
UNION ALL
SELECT orderid, item_num+1, product, quantity
FROM CTE
WHERE item_num < quantity
)
SELECT *
FROM CTE
OPTION (MAXRECURSION 0)
I'm not on a computer with a database engine where I can test this, so let me know how it goes.
Well, IF you know the maximum value for the # of products for any product (and it's not too big, say 4), you can:
Create a helper table called Nums containing 1 integer column n, with rows containing 1,2,3,4
Run
SELECT * from Your_table, Nums
WHERE Nums.n <= Your_table.quantity