Computed table column with MAX value between rows containing a shared value - firebird

I have the following table
CREATE TABLE T2
( ID_T2 integer NOT NULL PRIMARY KEY,
FK_T1 integer, <--- foreign key to T1(Table1)
FK_DATE date, <--- foreign key to T1(Table1)
T2_DATE date, <--- user input field
T2_MAX_DIFF COMPUTED BY ( (SELECT DATEDIFF (day, MAX(T2_DATE), CURRENT_DATE) FROM T2 GROUP BY FK_T1) )
);
I want T2_MAX_DIFF to display the number of days since last input across all similar entries with a common FK_T1.
It does work, but if another FK_T1 values is added to the table, I'm getting an error about "multiple rows in singleton select".
I'm assuming that I need some sort of WHERE FK_T1 = FK_T1 of corresponding row. Is it possible to add this? I'm using Firebird 3.0.7 with flamerobin.

The error "multiple rows in singleton select" means that a query that should provide a single scalar value produced multiple rows. And that is not unexpected for a query with GROUP BY FK_T1, as it will produce a row per FK_T1 value.
To fix this, you need to use a correlated sub-query by doing the following:
Alias the table in the subquery to disambiguate it from the table itself
Add a where clause, making sure to use the aliased table (e.g. src, and src.FK_T1), and explicitly reference the table itself for the other side of the comparison (e.g. T2.FK_T1)
(optional) remove the GROUP BY clause because it is not necessary given the WHERE clause. However, leaving the GROUP BY in place may uncover certain types of errors.
The resulting subquery then becomes:
(SELECT DATEDIFF (day, MAX(src.T2_DATE), CURRENT_DATE)
FROM T2 src
WHERE src.FK_T1 = T2.FK_T1
GROUP BY src.FK_T1)
Notice the alias src for the table referenced in the subquery, the use of src.FK_T1 in the condition, and the explicit use of the table in T2.FK_T1 to reference the column of the current row of the table itself. If you'd use src.FK_T1 = FK_T1, it would compare with the FK_T1 column of src (as if you'd used src.FK_T1 = src.FK_T2), so that would always be true.
CREATE TABLE T2
( ID_T2 integer NOT NULL PRIMARY KEY,
FK_T1 integer,
FK_DATE date,
T2_DATE date,
T2_MAX_DIFF COMPUTED BY ( (
SELECT DATEDIFF (day, MAX(src.T2_DATE), CURRENT_DATE)
FROM T2 src
WHERE src.FK_T1 = T2.FK_T1
GROUP BY src.FK_T1) )
);

Related

postgresql group by datetime in join query

I have 2 tables in my postgresql timescaledb database (version 12.06) that I try to query through inner join.
Tables' structure:
CREATE TABLE currency(
id serial PRIMARY KEY,
symbol TEXT NOT NULL,
name TEXT NOT NULL,
quote_asset TEXT
);
CREATE TABLE currency_price (
currency_id integer NOT NULL,
dt timestamp WITHOUT time ZONE NOT NULL,
open NUMERIC NOT NULL,
high NUMERIC NOT NULL,
low NUMERIC NOT NULL,
close NUMERIC,
volume NUMERIC NOT NULL,
PRIMARY KEY (
currency_id,
dt
),
CONSTRAINT fk_currency FOREIGN KEY (currency_id) REFERENCES currency(id)
);
The query I'm trying to make is:
SELECT currency_id AS id, symbol, MAX(close) AS close, DATE(dt) AS date
FROM currency_price
JOIN currency ON
currency.id = currency_price.currency_id
GROUP BY currency_id, symbol, date
LIMIT 100;
Basically, it returns all the rows that exist in currency_price table. I know that postgres doesn't allow select columns without an aggregate function or including them in "group by" clause. So, if I don't include dt column in my select query, i receive expected results, but if I include it, the output shows rows of every single day of each currency while I only want to have the max value of every currency and filter them out based on various dates afterwards.
I'm very inexperienced with SQL in general.
Any suggestions to solve this would be very appreciated.
There are several ways to do it, easiest one comes to mind is using window functions.
select *
from (
SELECT currency_id,symbol,close,dt
,row_number() over(partition by currency_id,symbol
order by close desc,dt desc) as rr
FROM currency_price
JOIN currency ON currency.id = currency_price.currency_id
where dt::date = '2021-06-07'
)q1
where rr=1
General window functions:
https://www.postgresql.org/docs/9.5/functions-window.html
works also with standard aggregate functions like SUM,AVG,MAX,MIN and others.
Some examples: https://www.postgresqltutorial.com/postgresql-window-function/

How can I restrict a result to only include rows where one specific field is unique with UNION Select statement in BigQuery?

I have the following code. I try to stitch the two tables together, but restrict it to only add duplicate Opportunity_ID once, and then from the second table (OpportunitiesUpdates).
SELECT
Opportunity.Account_Name,
Opportunity.Opportunity_Name,
Opportunity.Opportunity_Owner,
Opportunity.Opportunity_ID
FROM
Opportunity
UNION DISTINCT
SELECT
OpportunityUpdates.Account_Name,
OpportunityUpdates.Opportunity_Name,
OpportunityUpdates.Opportunity_Owner,
OpportunityUpdates.Opportunity_ID
FROM
OpportunityUpdates
WHERE OpportunityUpdates.Opportunity_ID <> Opportunity.Opportunity_ID
This code consolidates all records from both tables (by Opportunity_ID) and gives priority to the OpportunityUpdates table based on Opportunity_ID.
It assumes that the same Opportunity_ID could be in either table ("duplicates"), but that within each table an Opportunity_ID is unique. It also assumes that Opportunity_ID is not nullable (never null).
SELECT DISTINCT
IF(ou.Opportunity_ID IS NOT NULL, ou.Account_Name, o.Account_Name) Account_Name,
IF(ou.Opportunity_ID IS NOT NULL, ou.Opportunity_Name, o.Opportunity_Name) Opportunity_Name,
IF(ou.Opportunity_ID IS NOT NULL, ou.Opportunity_Owner, o.Opportunity_Owner) Opportunity_Owner,
COALESCE(ou.Opportunity_ID, o.Opportunity_ID) Opportunity_ID
FROM OpportunityUpdates ou
FULL OUTER JOIN
Opportunity o
ON o.Opportunity_ID = ou.Opportunity_ID

Use sum function in calculated column

Is it possible to use a sum function in a calculated column?
If yes, I would like to create a calculated column, that calculates the sum of a column in the same table where the date is smaller than the date of this entry. is this possible?
And last, would this optimize repeated calls on this value over the exemplified view below?
SELECT ProductGroup, SalesDate, (
SELECT SUM(Sales)
FROM SomeList
WHERE (ProductGroup= KVU.ProductGroup) AND (SalesDate<= KVU.SalesDate)) AS cumulated
FROM SomeList AS KVU
Is it possible to use a sum function in a calculated column?
Yes, it's possible using a scalar valued function (scalar UDF) for you computed column but this would be a disaster. Using scalar UDFs for computed columns destroy performance. Adding a scalar UDF that accesses data (which would be required here) makes things even worse.
It sounds to me like you just need a good ol' fashioned index to speed things up. First some sample data:
IF OBJECT_ID('dbo.somelist','U') IS NOT NULL DROP TABLE dbo.somelist;
GO
CREATE TABLE dbo.somelist
(
ProductGroup INT NOT NULL,
[Month] TINYINT NOT NULL CHECK ([Month] <= 12),
Sales DECIMAL(10,2) NOT NULL
);
INSERT dbo.somelist
VALUES (1,1,22),(2,1,45),(2,1,25),(2,1,19),(1,2,100),(1,2,200),(2,2,50.55);
and the correct index:
CREATE NONCLUSTERED INDEX nc_somelist ON dbo.somelist(ProductGroup,[Month])
INCLUDE (Sales);
With this index in place this query would be extremely efficient:
SELECT s.ProductGroup, s.[Month], SUM(s.Sales)
FROM dbo.somelist AS s
GROUP BY s.ProductGroup, s.[Month];
If you needed to get a COUNT by month & product group you could create an indexed view like so:
CREATE VIEW dbo.vw_somelist WITH SCHEMABINDING AS
SELECT s.ProductGroup, s.[Month], TotalSales = COUNT_BIG(*)
FROM dbo.somelist AS s
GROUP BY s.ProductGroup, s.[Month];
GO
CREATE UNIQUE CLUSTERED INDEX uq_cl__vw_somelist ON dbo.vw_somelist(ProductGroup, [Month]);
Once that indexed view was in place your COUNTs would be pre-aggregated. You cannot, however, include SUM in an indexed view.

Another way of returning rows if any of the columns has different value for the same id

Is there any other way for returning rows for the same id by joining two tables and return the row if any of the columns value for the same id is different.
Select Table1.No,Table2.No,Table1.Name,Table2.Name,Table1.ID,Table2.ID,Table1.ID_N,Table2.ID_N
From MyFirstTable Table1
JOIN MySecondTable Table2
ON Table1.No=Table2.No where Table1.ID!=Table2.ID or Table1.ID_N != Table2.ID_N
In the example above , I have only two columns I need to check but in my real case there are at least 20 .
Is there any other statment I can use instead of enumerating each column in the where codition?
...WHERE BINARY_CHECKSUM(Table1.*) <> BINARY_CHECKSUM(Table2.*)
or
...WHERE BINARY_CHECKSUM(Table1.Field1, Table1.Field2, ...) <> BINARY_CHECKSUM(Table2..Field1, Table2.Field2, ...)
*this assumes you have no blob fields in your tables
http://technet.microsoft.com/en-us/library/ms173784.aspx
If No is a PK
Select Table1.No,Table1.Name,Table1.ID,Table1.ID_N
From MyFirstTable Table1
except
Select Table1.No,Table1.Name,Table1.ID,Table1.ID_N
From MySecondTable Table1

an empty row with null-like values in not-null field

I'm using postgresql 9.0 beta 4.
After inserting a lot of data into a partitioned table, i found a weird thing. When I query the table, i can see an empty row with null-like values in 'not-null' fields.
That weird query result is like below.
689th row is empty. The first 3 fields, (stid, d, ticker), are composing primary key. So they should not be null. The query i used is this.
select * from st_daily2 where stid=267408 order by d
I can even do the group by on this data.
select stid, date_trunc('month', d) ym, count(*) from st_daily2
where stid=267408 group by stid, date_trunc('month', d)
The 'group by' results still has the empty row.
The 1st row is empty.
But if i query where 'stid' or 'd' is null, then it returns nothing.
Is this a bug of postgresql 9b4? Or some data corruption?
EDIT :
I added my table definition.
CREATE TABLE st_daily
(
stid integer NOT NULL,
d date NOT NULL,
ticker character varying(15) NOT NULL,
mp integer NOT NULL,
settlep double precision NOT NULL,
prft integer NOT NULL,
atr20 double precision NOT NULL,
upd timestamp with time zone,
ntrds double precision
)
WITH (
OIDS=FALSE
);
CREATE TABLE st_daily2
(
CONSTRAINT st_daily2_pk PRIMARY KEY (stid, d, ticker),
CONSTRAINT st_daily2_strgs_fk FOREIGN KEY (stid)
REFERENCES strgs (stid) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT st_daily2_ck CHECK (stid >= 200000 AND stid < 300000)
)
INHERITS (st_daily)
WITH (
OIDS=FALSE
);
The data in this table is simulation results. Multithreaded multiple simulation engines written in c# insert data into the database using Npgsql.
psql also shows the empty row.
You'd better leave a posting at http://www.postgresql.org/support/submitbug
Some questions:
Could you show use the table
definitions and constraints for the
partions?
How did you load your data?
You get the same result when using
another tool, like psql?
The answer to your problem may very well lie in your first sentence:
I'm using postgresql 9.0 beta 4.
Why would you do that? Upgrade to a stable release. Preferably the latest point-release of the current version.
This is 9.1.4 as of today.
I got to the same point: "what in the heck is that blank value?"
No, it's not a NULL, it's a -infinity.
To filter for such a row use:
WHERE
case when mytestcolumn = '-infinity'::timestamp or
mytestcolumn = 'infinity'::timestamp
then NULL else mytestcolumn end IS NULL
instead of:
WHERE mytestcolumn IS NULL