SQL Table update based on values within table - tsql

I am looking to update rows in a table with values from other rows on the same table based on a common value.
In the table below I need to update the rows where 'Effective', 'End' and 'Enddate' are blank with the values from the row with the same IRN and FRN and where Name='[FCA CF] Functions requiring qualifications'.
So row 1 would inherit the data from row 2, row 3 from row 4, etc.
How can I do that please?
Thanks.

Looks like a regular update statement with a self join should do it.
Sample data
create table data
(
IRN int,
FRN int,
Name nvarchar(50),
Effective bit,
Ended bit,
EndDate date
);
insert into data (IRN, FRN, Name, Effective, Ended, EndDate) values
(100, 200, 'Something else?', null, null, null),
(100, 200, '[FCA CF] Functions requiring qualifications', 1, 1, '2020-12-06'),
(100, 200, 'Not what we are looking for', 0, 0, null),
(100, 300, 'Do not touch, different FRN', null, null, null);
Solution
update d1
set d1.Effective = d2.Effective,
    d1.Ended = d2.Ended,
    d1.EndDate = d2.EndDate
from data d1
join data d2
  on d2.IRN = d1.IRN
 and d2.FRN = d1.FRN
where d1.Effective is null
  and d1.Ended is null
  and d1.EndDate is null
  and d2.Name = '[FCA CF] Functions requiring qualifications';
Result
IRN FRN Name Effective Ended EndDate
--- --- ------------------------------------------- --------- ----- ----------
100 200 Something else? True True 2020-12-06
100 200 [FCA CF] Functions requiring qualifications True True 2020-12-06
100 200 Not what we are looking for False False null
100 300 Do not touch, different FRN null null null
Fiddle to see it in action.

Related

Is it possible to look at the output of the previous row of a PostgreSQL query?

This is the question: Is it possible to look at the outputs, what has been selected, from the previous row of a running SQL query in Postgres?
I know that lag exists to look at the inputs, the "from" of the query. I also know that a CTE, subquery or lateral join can solve most issues of this kind. But I think the problem I'm facing genuinely requires a peek at the output of the previous row. Why? Because the output of the current row depends on a constant from a lookup table, and the value used to look up that constant is an aggregate of all the previous rows. If that lookup returns the wrong constant, all subsequent rows will drift further and further from the expected value.
The rest of this text is a simplified example based on the problem I'm facing. It should be possible to paste it into PostgreSQL 12 or above and play around. I'm terribly sorry that it is as complicated as it is, but I think it is the simplest I can make it while still retaining the core issue: a lookup in a lookup table based on an aggregate of all previous rows, plus the fact that the "inventory" being tracked is modeled as a series of transactions of two discrete types.
The database itself exists to keep track of multiple fish farms, or cages full of fish. Fish can be transferred between these farms, and the farms are fed roughly daily. Why not just carry the aggregate as a field in the table? Because it should be possible to switch out the lookup table after the season is over, to adjust it to better match reality.
-- A listing of all groups of fish ever grown.
create table farms (
id bigserial primary key,
start timestamp not null,
stop timestamp
);
insert into farms
(id, start)
values (
1, '2021-02-01T13:37'
);
-- A transfer of fish from one farm to another.
-- If the source is null the fish is transferred from another fishery outside our system.
-- If the destination is null the fish is being slaughtered, removed from the system.
create table transfers (
source bigint references farms(id),
destination bigint references farms(id),
timestamp timestamp not null default current_timestamp,
total_weight_g bigint not null constraint positive_nonzero_total_weight_g check (total_weight_g > 0),
average_weight_g bigint not null constraint positive_nonzero_average_weight_g check (average_weight_g > 0),
number_fish bigint generated always as (total_weight_g / average_weight_g) stored
);
insert into transfers
(source, destination, timestamp, total_weight_g, average_weight_g)
values
(null, 1, '2021-02-01T16:38', 5, 5),
(null, 1, '2021-02-15T16:38', 500, 500);
-- Transactions of fish feed into a farm.
create table feedings (
id bigserial primary key,
growth_table bigint not null,
farm bigint not null references farms(id),
amount_g bigint not null constraint positive_nonzero_amunt_g check (amount_g > 0),
timestamp timestamp not null
);
insert into feedings
(farm, growth_table, amount_g, timestamp)
values
(1, 1, 1, '2021-02-02T13:37'),
(1, 1, 1, '2021-02-03T13:37'),
(1, 1, 1, '2021-02-04T13:37'),
(1, 1, 1, '2021-02-05T13:37'),
(1, 1, 1, '2021-02-06T13:37'),
(1, 1, 1, '2021-02-07T13:37');
create view combined_feed_and_transfer_history as
with transfer_history as (
select timestamp, destination as farm, total_weight_g, average_weight_g, number_fish
from transfers as deposits
where deposits.destination = 1 -- TODO: This view only works for one farm, fix that.
union all
select timestamp, source as farm, -total_weight_g, -average_weight_g, -number_fish
from transfers as withdrawals
where withdrawals.source = 1
)
select timestamp, farm, total_weight_g, number_fish, average_weight_g, null as growth_table
from transfer_history
union all
select timestamp, farm, amount_g, 0 as number_fish, 0 as average_weight_g, growth_table
from feedings
order by timestamp;
-- Conversion tables from feed to gained weight.
create table growth_coefficients (
growth_table bigserial not null,
average_weight_g bigint not null constraint positive_nonzero_weight check (average_weight_g > 0),
feed_conversion_rate double precision not null constraint positive_foderkonverteringsfaktor check (feed_conversion_rate >= 0),
primary key(growth_table, average_weight_g)
);
insert into growth_coefficients
(average_weight_g, feed_conversion_rate, growth_table)
values
(5.00,0.10,1),
(10.00,10.00,1),
(20.00,1.30,1),
(50.00,1.31,1),
(100.00,1.32,1),
(300.00,1.36,1),
(600.00,1.42,1),
(1000.00,1.50,1),
(1500.00,1.60,1),
(2000.00,1.70,1),
(2500.00,1.80,1),
(3000.00,1.90,1),
(4000.00,2.10,1),
(5000.00,2.30,1);
-- My current solution is a bad one. It uses a CTE that sums over all events but does not account
-- for the feed conversion rate. That means that the average weight used to look up the feed
-- conversion rate will diverge more and more from reality the further into the season time goes.
-- This is why it is important to look at the output, the average weight, of the previous row.
-- We start by summing up all the transfer and feed events to get a rough average_weight_g.
with estimate as (
select
timestamp,
farm,
total_weight_g as transaction_size_g,
growth_table,
sum(total_weight_g) over (order by timestamp) as sum_weight_g,
sum(number_fish) over (order by timestamp) as sum_number_fish,
sum(total_weight_g) over (order by timestamp) / sum(number_fish) over (order by timestamp) as average_weight_g
from
combined_feed_and_transfer_history
)
select
timestamp,
sum_number_fish,
transaction_size_g as trans_g,
sum_weight_g,
closest_lookup_table_weight.average_weight_g as lookup_g,
converted_weight_g as conv_g,
sum(converted_weight_g) over (order by timestamp) as sum_conv_g,
sum(converted_weight_g) over (order by timestamp) / sum_number_fish as sum_average_g
from
estimate
join lateral ( -- We then use this estimated_average_weight to look up the closest constant in the growth coefficient table.
(select gc.average_weight_g - estimate.average_weight_g as diff, gc.average_weight_g from growth_coefficients gc where gc.average_weight_g >= estimate.average_weight_g order by gc.average_weight_g asc limit 1)
union all
(select estimate.average_weight_g - gc.average_weight_g as diff, gc.average_weight_g from growth_coefficients gc where gc.average_weight_g <= estimate.average_weight_g order by gc.average_weight_g desc limit 1)
order by diff
limit 1
) as closest_lookup_table_weight
on true
join lateral ( -- If the historical event is a feeding we need to look up the feed conversion rate.
select case when estimate.growth_table is null then 1
else (select gc.feed_conversion_rate
from growth_coefficients gc
where gc.growth_table = estimate.growth_table
and gc.average_weight_g = closest_lookup_table_weight.average_weight_g)
end as feed_conversion_rate
) as growth_coefficient
on true
join lateral (
select feed_conversion_rate * transaction_size_g as converted_weight_g
) as converted_weight_g
on true;
At the very bottom is my current "solution". With the above example data, sum_conv_g should end up being 5.6, but because the aggregate used for the lookup does not account for the conversion rate, sum_conv_g ends up as 45.2 instead.
One idea I had was whether there is something like query-local variables that could store sum_average_g between rows. There's always the escape hatch of querying out the transactions into my general-purpose programming language, Clojure, and solving it there, but it would be neat if it could be solved entirely within the database.
You have to formulate a recursive subquery. I posted a simplified version of this question over at the DBA SE and got the answer there. The answer to that question can be found here and can be expanded to this more complicated question, though I would wager that no one will ever have the interest to do that.
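To illustrate the general shape of such a recursive query, here is a minimal PostgreSQL sketch. It is not the DBA SE answer and does not use the schema above: the tables feed_events and rates, their columns, and the starting weight of 8 are all assumptions made up for the example. The point it shows is that each recursive step can look up a rate using the weight produced by the previous step.

```sql
-- Hypothetical, heavily simplified tables (not the schema from the question).
create temporary table feed_events (
    seq      int primary key,      -- 1, 2, 3, ... in time order
    amount_g numeric not null
);
create temporary table rates (
    min_weight_g numeric primary key,  -- rate applies from this weight upward
    rate         numeric not null
);
insert into feed_events values (1, 10), (2, 10), (3, 10);
insert into rates values (0, 0.5), (10, 2.0);

with recursive running as (
    -- Anchor row: the assumed weight before any feeding.
    select 0 as seq, 8::numeric as weight_g
    union all
    -- Recursive step: the rate is looked up from r.weight_g,
    -- i.e. from the OUTPUT of the previous row.
    select e.seq,
           r.weight_g + e.amount_g * (
               select c.rate
               from rates c
               where c.min_weight_g <= r.weight_g
               order by c.min_weight_g desc
               limit 1)
    from running r
    join feed_events e on e.seq = r.seq + 1
)
select seq, weight_g
from running
order by seq;
```

With this sample data the first feeding is converted at rate 0.5 (weight 8 goes to 13), and the later ones at rate 2.0, because by then the running weight has crossed the 10 g threshold. A window function over the raw inputs cannot express that dependency.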

How to show chain element by order

My goal is to write a query that returns item ids ordered by their position in a chain.
The chain logic is that each element has a right and a left fk and an index.
Elements can be added to the chain either by appending or by prepending, so the id from the table does not help to reconstruct the current order of the chain.
This is db structure
create table public.chain_data
(
id integer not null
constraint chain_data_pkey
primary key,
unique_identifiers_id integer not null
constraint fk_388447e52a0b191e
references public.unique_identifiers
on delete cascade,
chain_data_name varchar(255) not null,
carriage boolean default false,
left_id integer
constraint fk_388447e5e26cce02
references public.chain_data,
right_id integer
constraint fk_388447e554976835
references public.chain_data
);
alter table public.chain_data
owner to "universal-counter";
create index idx_388447e52a0b191e
on public.chain_data (unique_identifiers_id);
create unique index left_right_uniq_idx
on public.chain_data (right_id, left_id);
create unique index carriage_uniq_index
on public.chain_data (unique_identifiers_id, carriage)
where (carriage <> false);
And here is some example data. This chain began with id = 10, and new items (rows) were then prepended at the start of the chain. Each element has left and right dependencies. The inserts:
INSERT INTO public.chain_data (id, unique_identifiers_id, chain_data_name, carriage, left_id, right_id)
VALUES
(10, 8, 'dddd_2', true, 22, null),
(22, 8, 'shuba', false, 23, 10),
(24, 8, 'viktor', false, null, 23),
(23, 8, 'ivan', false, 24, 22);
For this data the query should return the ids like this:
24, 23, 22, 10
because the element with id = 24 is at the start of the chain; following the left and right dependencies then gives 23, 22 and 10, where id = 10 is the last element in the chain.
demo:db<>fiddle
You can use a recursive CTE for that:
WITH RECURSIVE chain AS (
SELECT id, right_id -- 1
FROM chain_data
WHERE left_id IS NULL
UNION
SELECT cd.id, cd.right_id -- 2
FROM chain_data cd
JOIN chain c ON c.right_id = cd.id
)
SELECT
string_agg(id::text, ', ') -- 3
FROM
chain
1. Initial part of the recursion: the record whose left_id is NULL.
2. The recursion part: join the table to the previous step, using the previous right_id as the current id.
3. Afterwards you can aggregate all fetched records with the string_agg() aggregation to return your string list.
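One caveat: string_agg() without an ORDER BY does not promise any particular order. A possible variation, sketched here under the assumption that only id, left_id and right_id matter, carries a step counter through the recursion and sorts on it. The table chain_demo and the column pos are hypothetical names introduced for this example.

```sql
-- Hypothetical cut-down copy of chain_data with only the linking columns.
CREATE TEMPORARY TABLE chain_demo (
    id       integer PRIMARY KEY,
    left_id  integer,
    right_id integer
);
INSERT INTO chain_demo (id, left_id, right_id)
VALUES (10, 22, NULL),
       (22, 23, 10),
       (24, NULL, 23),
       (23, 24, 22);

WITH RECURSIVE chain AS (
    SELECT id, right_id, 1 AS pos          -- head of the chain
    FROM chain_demo
    WHERE left_id IS NULL
    UNION ALL
    SELECT cd.id, cd.right_id, c.pos + 1   -- next element to the right
    FROM chain_demo cd
    JOIN chain c ON c.right_id = cd.id
)
SELECT string_agg(id::text, ', ' ORDER BY pos) AS ordered_ids
FROM chain;
-- ordered_ids: 24, 23, 22, 10
```

The same pos column can be selected directly if one row per element is preferred over a single string.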

How do I insert multiple values into PG with different where clauses?

Here is how my data is set up:
Zone
------
zone_id
value
other_id
I am going to have the same zone_id for all of the updates. I am only updating the value, but the other_id is different in each case.
e.g.: zone_id: 1, [{value: 10, other_id: 12}, {value: 40, other_id: 17}, ...]
I want to do this all in one statement.
A single update would be UPDATE zone SET value = {value} WHERE zone_id = {id} AND other_id = {other}, but I want to set multiple values in the same statement.
How do I do this? Is this possible?
I'm not 100% clear, but I think you are asking to set the value conditionally. So something like this should work:
UPDATE zone
SET value = CASE other_id
WHEN 12 THEN 10
WHEN 17 THEN 40
-- and so on
ELSE value
END
WHERE zone_id = 1
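An alternative that may scale better to long lists in PostgreSQL is to join the table against a VALUES list with UPDATE ... FROM. The sketch below uses a hypothetical table zone_demo with integer columns (the real column types are not given in the question), and new_value is a name made up for the example.

```sql
-- Hypothetical stand-in for the zone table; integer columns are an assumption.
CREATE TEMPORARY TABLE zone_demo (zone_id int, value int, other_id int);
INSERT INTO zone_demo (zone_id, value, other_id)
VALUES (1, 0, 12), (1, 0, 17), (1, 5, 99), (2, 0, 12);

-- One row in the VALUES list per (other_id, new value) pair.
UPDATE zone_demo AS z
SET value = v.new_value
FROM (VALUES (12, 10), (17, 40)) AS v(other_id, new_value)
WHERE z.zone_id = 1
  AND z.other_id = v.other_id;

SELECT zone_id, other_id, value
FROM zone_demo
ORDER BY zone_id, other_id;
```

Unlike the CASE form, rows whose other_id is not in the list are not touched at all, so no ELSE branch is needed.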

Constraint on sum from rows

I've got a table in PostgreSQL 9.4:
user_votes (
user_id int,
portfolio_id int,
car_id int,
vote int
)
Is it possible to put a constraint on the table so that a user has at most 99 points to vote with in each portfolio?
This means that a user can have multiple rows with the same user_id and portfolio_id, but different car_id and vote. The sum of the votes should never exceed 99, but the points can be spread across different cars.
So doing:
INSERT INTO user_votes (user_id, portfolio_id, car_id, vote) VALUES
(1, 1, 1, 20),
(1, 1, 7, 40),
(1, 1, 9, 25)
would all be allowed, but trying to add something that pushes the total above 99 votes should fail, like another row:
INSERT INTO user_votes (user_id, portfolio_id, car_id, vote) VALUES
(1, 1, 21, 40)
Unfortunately no. If you try to create such a constraint you will see this error message:
ERROR: aggregate functions are not allowed in check constraints
But the wonderful thing about PostgreSQL is that there is always more than one way to skin a cat. You can use a BEFORE trigger to check that the data you are trying to insert fulfills your requirements.
Row-level triggers fired BEFORE can return null to signal the trigger
manager to skip the rest of the operation for this row (i.e.,
subsequent triggers are not fired, and the INSERT/UPDATE/DELETE does
not occur for this row). If a nonnull value is returned then the
operation proceeds with that row value.
Inside your trigger you would sum the votes the user has already cast in that portfolio
SELECT COALESCE(SUM(vote), 0) INTO vote_total FROM user_votes WHERE user_id = NEW.user_id AND portfolio_id = NEW.portfolio_id
Now if vote_total + NEW.vote exceeds 99 you return NULL and the data will not be inserted.

Comparing 2 tables for new or updated rows using composite keys

I'm writing T-SQL for SQL Server 2008. I've got two tables with roughly 2 million rows each. The Source table gets updated daily and changes are pushed to the Destination table based on a last_edit date. If this date is newer in source than in destination, the destination row should be updated. If a row exists in source but not in destination, it should be inserted into destination. This is a one-way process, from source to destination. The source and destination tables use a unique identifier across 4 columns: serialid, itemid, systemcode, and role.
My tables are modeled similarly to the script below. There are many data columns but I've limited it to 3 in this example. I'm looking for 2 outputs: one set of rows to update and one set of rows to add.
CREATE TABLE [dbo].[TABLE_DEST](
[SERIALID] [nvarchar](20) NOT NULL,
[ITEMID] [nvarchar](20) NOT NULL,
[SYSTEMCODE] [nvarchar](20) NOT NULL,
[ROLE] [nvarchar](10) NOT NULL,
[LAST_EDIT] [datetime] NOT NULL,
[DATA_COLUMN_1] [nvarchar](10) NOT NULL,
[DATA_COLUMN_2] [nvarchar](10) NOT NULL,
[DATA_COLUMN_3] [nvarchar](10) NOT NULL
)
CREATE TABLE [dbo].[TABLE_SOURCE](
[SERIALID] [nvarchar](20) NOT NULL,
[ITEMID] [nvarchar](20) NOT NULL,
[SYSTEMCODE] [nvarchar](20) NOT NULL,
[ROLE] [nvarchar](10) NOT NULL,
[LAST_EDIT] [datetime] NOT NULL,
[DATA_COLUMN_1] [nvarchar](10) NOT NULL,
[DATA_COLUMN_2] [nvarchar](10) NOT NULL,
[DATA_COLUMN_3] [nvarchar](10) NOT NULL
)
Here's what I've got for the update dataset.
select s.*
from table_dest d (nolock) inner join table_source s (nolock)
on s.SYSTEMCODE = d.SYSTEMCODE
and s.ROLE = d.ROLE
and s.SERIALID = d.SERIALID
and s.ITEMID = d.ITEMID
and s.LAST_EDIT > d.LAST_EDIT
I don't know how best to accomplish finding the rows to add. But the solution has to be pretty efficient for the database.
Unmatched rows can be found with left/right join and checking target table keys for null:
select s.*, case when d.key1 is null then 'insert' else 'update' end [action]
from [table_dest] d right join [table_source] s on (d.key1 = s.key1 /* etc.. */)
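Spelled out for the four key columns, that idea could look roughly like the sketch below. It assumes the two tables from the question exist with the columns shown there; the LAST_EDIT filter is added here so that unchanged rows are left out.

```sql
-- Classify every source row that is either new or newer than its destination row.
SELECT s.*,
       CASE WHEN d.SERIALID IS NULL THEN 'insert' ELSE 'update' END AS [action]
FROM dbo.TABLE_SOURCE AS s
LEFT JOIN dbo.TABLE_DEST AS d
       ON  d.SERIALID   = s.SERIALID
       AND d.ITEMID     = s.ITEMID
       AND d.SYSTEMCODE = s.SYSTEMCODE
       AND d.[ROLE]     = s.[ROLE]
WHERE d.SERIALID IS NULL            -- no match in destination: row to add
   OR s.LAST_EDIT > d.LAST_EDIT;    -- match exists, but source is newer
```

Because SERIALID is declared NOT NULL, a NULL in d.SERIALID can only come from a missing match. An index on the four key columns in both tables would likely help at 2 million rows, though that depends on the actual data.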
If you need these rows only in order to perform the respective operations, there is a dedicated statement for that, MERGE:
merge [table_dest] d
using [table_source] s on (d.key1 = s.key1 /* etc.. */)
when matched then
update set d.a = s.a
when not matched by target then
insert (key1, .., a) values (s.key1, ..., s.a);
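For the tables in the question, the statement might be filled in as in the following sketch. It assumes both tables exist as defined above, and it adds the LAST_EDIT comparison to the matched branch so that only rows that are newer in the source get updated.

```sql
MERGE dbo.TABLE_DEST AS d
USING dbo.TABLE_SOURCE AS s
   ON  d.SERIALID   = s.SERIALID
   AND d.ITEMID     = s.ITEMID
   AND d.SYSTEMCODE = s.SYSTEMCODE
   AND d.[ROLE]     = s.[ROLE]
-- Key exists in both tables and the source row is newer: update.
WHEN MATCHED AND s.LAST_EDIT > d.LAST_EDIT THEN
    UPDATE SET d.LAST_EDIT     = s.LAST_EDIT,
               d.DATA_COLUMN_1 = s.DATA_COLUMN_1,
               d.DATA_COLUMN_2 = s.DATA_COLUMN_2,
               d.DATA_COLUMN_3 = s.DATA_COLUMN_3
-- Key exists only in the source: insert.
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SERIALID, ITEMID, SYSTEMCODE, [ROLE], LAST_EDIT,
            DATA_COLUMN_1, DATA_COLUMN_2, DATA_COLUMN_3)
    VALUES (s.SERIALID, s.ITEMID, s.SYSTEMCODE, s.[ROLE], s.LAST_EDIT,
            s.DATA_COLUMN_1, s.DATA_COLUMN_2, s.DATA_COLUMN_3);
```

Rows that exist only in the destination are left alone, which matches the one-way source-to-destination process described in the question.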