Suppose I have four tables: A, B, C, and D.
All of them share exactly the same four columns:
last_modified_by
last_modified_time
active
inactive_date
So, to avoid code duplication, I created:
CREATE TABLE X(
last_modified_by,
last_modified_time,
active,
inactive_date);
Now A, B, C and D will be something like:
CREATE TABLE A (
...,
...,
) INHERITS (X);
Now I want to partition table A by the column active, so I tried:
CREATE TABLE A (
...,
...,
) INHERITS (X) PARTITION BY LIST (active);
But this fails with the error: cannot create partitioned table as inheritance child
So how should I do this?
Don't use inheritance just to avoid code duplication. You may have unpleasant surprises, e.g. if you want to change the data type of a column for only one table.
Besides, it won't work together with partitioning, because both use the same technology under the hood.
To avoid code duplication, you can use
CREATE TABLE b (LIKE a);
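Applied to the question, a minimal sketch could look like the following. The column types, the table name audit_columns, and the a_id column are assumptions made for illustration; the question does not give them. It also assumes PostgreSQL 10 or later, where declarative partitioning is available.

```sql
-- Template table holding the shared columns (names from the question, types assumed).
CREATE TABLE audit_columns (
    last_modified_by   text,
    last_modified_time timestamptz,
    active             boolean,
    inactive_date      date
);

-- LIKE copies the column definitions once; table a keeps no link to audit_columns.
CREATE TABLE a (
    a_id bigint,            -- hypothetical table-specific column
    LIKE audit_columns
) PARTITION BY LIST (active);

-- One partition per value of the partition key.
CREATE TABLE a_active   PARTITION OF a FOR VALUES IN (true);
CREATE TABLE a_inactive PARTITION OF a FOR VALUES IN (false);
```

With only these two partitions, a row whose active is NULL would be rejected, so a real schema would probably declare the column NOT NULL or add a partition for that case.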
Is it possible in Postgres to have an optional join?
My use case is something like
select ...
from a
inner join b using (b_id)
where b.type in (...)
a is a very large reporting table. b is used to filter a, BUT the most common use case is that we will want all b.types, and therefore all the b records in the join. In other words, in most cases we don't want to filter by b at all, and would not need the join in that case, but the filtering optionality still needs to be there in cases when the user wants to filter by type.
So is it possible to invoke the join optionally, and save the join effort in cases when we just want all of a?
If not, what's my next best option? IF ... THEN or CTE with a union of separate queries?
If you don't need any of b's columns, there is no need to JOIN table b. You can filter by using EXISTS(SELECT .. FROM b WHERE ...).
If you want to conditionally exclude a part of the WHERE clause, you could use the following construct, where the ignore_b Boolean acts as an on/off switch:
-- $ignore_b is a Boolean flag
-- when True, the optimiser will ignore the exists(...)
SELECT ...
FROM a
WHERE ( $ignore_b OR EXISTS (
SELECT *
FROM b
WHERE b.b_id = a.some_id
AND b.type in (1,2,3,4,5)
)
);
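One way to pass the flag from an application is a prepared statement. The sketch below is only an illustration: the statement name report, the table definitions, and the use of an int[] parameter for the type list are assumptions, not part of the original answer.

```sql
-- Hypothetical minimal tables so the example can run on its own.
CREATE TEMP TABLE b (b_id int PRIMARY KEY, type int);
CREATE TEMP TABLE a (a_id int, b_id int);

-- $1 is the on/off switch, $2 is the list of wanted types.
PREPARE report (boolean, int[]) AS
SELECT a.*
FROM a
WHERE $1
   OR EXISTS (
        SELECT 1
        FROM b
        WHERE b.b_id = a.b_id
          AND b.type = ANY ($2)
   );

EXECUTE report(true, NULL);               -- no filtering on b
EXECUTE report(false, ARRAY[1, 2, 3]);    -- filter by type
```

Whether the planner actually skips the subquery for a given call can be checked with EXPLAIN EXECUTE.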
In your example, you are still filtering on b even when every type is wanted, because the inner join only keeps rows of a whose b_id exists in b in the first place.
PostgreSQL will remove unneeded joins, but only under very specific circumstances. The join must be written as a LEFT JOIN, so that no rows of A can be removed due to the absence of corresponding rows in B. The column B.b_id must be a declared unique or primary key, so that no rows of A can be duplicated due to multiple matches in B. And of course, no column of B can be referenced in the query (except the reference to the key column in the left join condition).
In those cases, you can just always write the LEFT JOIN, and PostgreSQL will figure out that it can skip it.
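The conditions above can be checked with EXPLAIN. This is a sketch with assumed table definitions; the a_id and type columns are hypothetical.

```sql
-- b.b_id is a primary key, so a row of a matches at most one row of b.
CREATE TEMP TABLE b (b_id int PRIMARY KEY, type int);
CREATE TEMP TABLE a (a_id int, b_id int);

-- No column of b is used outside the join condition:
-- the plan is expected to scan only a.
EXPLAIN
SELECT a.a_id
FROM a
LEFT JOIN b ON b.b_id = a.b_id;

-- b.type is referenced in WHERE, so the join has to stay in the plan.
EXPLAIN
SELECT a.a_id
FROM a
LEFT JOIN b ON b.b_id = a.b_id
WHERE b.type IN (1, 2, 3);
```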
You can argue that if you have a declared foreign key constraint on the join condition, then you shouldn't need the JOIN to be a LEFT JOIN in order to implement this optimization. I think that that argument is correct, but PostgreSQL does not implement it that way.
I would just do it programmatically. If you are already programmatically adding references to B in the WHERE clause, you should be able to do it for the join as well.
In PostgreSQL:
Does a WITH clause create a temporary view or a temporary table? (If I am correct, a view stores the code of a query, while a table stores the result of a query.)
CREATE TEMPORARY VIEW creates a temporary view available only in the current session.
So what is the difference between a temporary view created by WITH and a temporary view created by CREATE TEMPORARY VIEW?
Database System Concepts seems to imply that WITH creates a temporary view instead of a temporary table:
Since the SQL:1999 version, the SQL standard supports a limited form of recursion, using the with recursive clause, where a view (or temporary view) is expressed in terms of itself. Recursive queries can be used, for example, to express transitive closure concisely. Recall that the with clause is used to define a temporary view whose definition is available only to the query in which it is defined. The additional keyword recursive specifies that the view is recursive.
A common table expression (CTE) is only available for a single query.
A temporary view (like a temporary table) is available for all queries in the current session. It is deleted at the end of the session.
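The difference in scope can be seen in a short session. The names c and v are placeholders chosen for this sketch.

```sql
-- The CTE c exists only inside this one statement.
WITH c AS (SELECT 1 AS n)
SELECT n FROM c;

-- A second statement cannot see it:
-- SELECT n FROM c;   -- would fail: relation "c" does not exist

-- The temporary view v stays available until the session ends (or it is dropped).
CREATE TEMPORARY VIEW v AS SELECT 1 AS n;
SELECT n FROM v;
SELECT n FROM v;
```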
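They are not really the same as temporary views.
In PostgreSQL, a CTE (WITH clause) is materialized into a table-like object, while a view behaves more like a macro. (Since PostgreSQL 12, a CTE that is side-effect-free and referenced only once can be inlined instead, but one that calls a volatile function such as random() is still materialized.)
This effect is most visible when one of the columns is a function that has a side effect or returns a different value on each call.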
select generate_series(1,3) as n into temp table a;
This creates a simple table containing 1, 2, 3.
create temporary view v as select n,random() as r from a;
select * from v as x join v as y on x.n=y.n;
Using the view, note that the random column does not match between x and y.
The same sort of result can be had by substituting the view expressions.
select x.n,random(),y.n,random()
from a as x join a as y on x.n=y.n;
or
select * from (select n,random() from a ) as x join
(select n,random() from a ) as y on x.n=y.n;
But with CTE:
with c as (select n,random() as r from a)
select * from c as x join c as y on x.n=y.n;
Using the CTE, note that the random column matches.
yet another way to make the same query is
Perhaps I'm approaching this all wrong, in which case feel free to point out a better way to solve the overall question, which is "How do I use an intermediate table for future queries?"
Let's say I've got tables foo and bar, which join on some baz_id, and I want to combine them into an intermediate table to be fed into upcoming queries. I know of the WITH .. AS (...) statement, but am running into problems such as:
WITH foobar AS (
SELECT *
FROM foo
INNER JOIN bar ON bar.baz_id = foo.baz_id
)
SELECT
baz_id
-- some other things as well
FROM
foobar
The issue is that (Postgres 9.4) tells me baz_id is ambiguous. I understand this happens because SELECT * includes all the columns in both tables, so baz_id shows up twice; but I'm not sure how to get around it. I was hoping to avoid copying the column names out individually, like
SELECT
foo.var1, foo.var2, foo.var3, ...
bar.other1, bar.other2, bar.other3, ...
FROM foo INNER JOIN bar ...
because there are hundreds of columns in these tables.
Is there some way around this I'm missing, or some altogether different way to approach the question at hand?
WITH foobar AS (
SELECT *
FROM foo
INNER JOIN bar USING(baz_id)
)
SELECT
baz_id
-- some other things as well
FROM
foobar
Joining with USING leaves only one instance of the baz_id column in the select list.
From the documentation:
The USING clause is a shorthand that allows you to take advantage of the specific situation where both sides of the join use the same name for the joining column(s). It takes a comma-separated list of the shared column names and forms a join condition that includes an equality comparison for each one. For example, joining T1 and T2 with USING (a, b) produces the join condition ON T1.a = T2.a AND T1.b = T2.b.
Furthermore, the output of JOIN USING suppresses redundant columns: there is no need to print both of the matched columns, since they must have equal values. While JOIN ON produces all columns from T1 followed by all columns from T2, JOIN USING produces one output column for each of the listed column pairs (in the listed order), followed by any remaining columns from T1, followed by any remaining columns from T2.
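A small end-to-end sketch of this behaviour follows. The table contents and the columns var1 and other1 are made up for the example.

```sql
CREATE TEMP TABLE foo (baz_id int, var1 text);
CREATE TEMP TABLE bar (baz_id int, other1 text);

INSERT INTO foo VALUES (1, 'f1'), (2, 'f2');
INSERT INTO bar VALUES (1, 'b1'), (3, 'b3');

-- USING merges the two baz_id columns, so SELECT * exposes baz_id only once
-- and the outer query can refer to it without ambiguity.
WITH foobar AS (
    SELECT *
    FROM foo
    INNER JOIN bar USING (baz_id)
)
SELECT baz_id, var1, other1
FROM foobar;
```

Only baz_id = 1 exists in both tables, so the query should return the single row (1, f1, b1). Note that USING only merges the listed join columns: if foo and bar share other column names, those would still be ambiguous when referenced.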
I want to update two columns in my table; one of them depends on the calculation of the other updated column. The calculation is rather complex, so I don't want to repeat it every time; I just want to use the newly updated value.
CREATE TABLE test (
A int,
B int,
C int,
D int
)
INSERT INTO test VALUES (0, 0, 5, 10)
UPDATE test
SET
B = C*D * 100,
A = B / 100
So my question: is it even possible to get 50 as the value for column A in just one query?
Another option would be to use persistent computed columns, but will that work when I have dependencies on another computed column?
You can't achieve what you are trying to do in a single query. This is due to a concept called All-at-Once Operations, which means: "In SQL Server, operations which appear in the same logical phase are evaluated at the same time."
The operations below won't yield the result you are expecting:
insert into table1
(t1,t1+100,t1+200) -- SQL won't use the new, incremented t1 value
The same goes for UPDATE:
update t1
set t1=t1*100,
t2=t1 -- SQL won't use the updated t1 value (*100)
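The updated value of B still cannot be read inside the same SET list, but the complex expression itself can be written once and reused. One possible sketch in T-SQL, using the test table from the question, is a CROSS APPLY; the alias calc and the column name new_b are hypothetical.

```sql
CREATE TABLE test (A int, B int, C int, D int);
INSERT INTO test VALUES (0, 0, 5, 10);

-- Compute the expression once per row, then reference it in both assignments.
UPDATE t
SET B = calc.new_b,
    A = calc.new_b / 100
FROM test AS t
CROSS APPLY (SELECT t.C * t.D * 100 AS new_b) AS calc;

SELECT A, B, C, D FROM test;
```

For the sample row this sets B to 5000 and A to 50, because both assignments read the same precomputed value rather than the old B.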
References:
TSQL Querying by Itzik Ben-Gan
Suppose I have a table with a column that has repeats, e.g.
Column1
---------
a
a
a
a
b
a
c
d
e
... so on
Maybe it has hundreds of thousands of rows. Then, say, I need to pull the distinct values from this column. I can do so easily in a SELECT with DISTINCT, but I'm wondering about the performance.
I could also give each item in Column1 an id, and then create a new table referenced by Column1 (to normalize this more appropriately). Though, this adds extra complexity to making an insert, and adds in joins for other possible queries.
Is there some way to index just the distinct values in a column, or is the normalization thing the only way to go?
An index on column1 will considerably speed up processing of DISTINCT, but if you are willing to trade some space and some (short) time during insert/update/delete, you can resort to a materialized view. In SQL Server this is an indexed view, which you can think of as a dynamic table produced and maintained according to the view definition.
create view view1
with schemabinding
as
select column1,
count_big(*) cnt
from dbo.theTable -- schemabinding requires a two-part name; the dbo schema is assumed
group by column1
-- create unique clustered index ix_view1 on view1(column1)
(Do not forget to execute the commented-out create index command. I usually do it this way so that the view definition contains the index definition, reminding me to apply it if I need to change the view.)
When you want to use it, be sure to add the noexpand hint to force use of the materialized data (this part remains a mystery to me: something created as a performance enhancement is not turned on by default, but rather activated on the spot).
select *
from view1 with (noexpand)
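For the simpler option mentioned at the start of this answer, a plain index on the column, a sketch might look like this. The table definition, the column type, and the index name are assumptions for illustration.

```sql
-- Hypothetical table; the question only names the column.
CREATE TABLE dbo.theTable (column1 varchar(50) NOT NULL);

-- A nonclustered index lets DISTINCT read the narrow index instead of the whole table.
CREATE INDEX ix_theTable_column1 ON dbo.theTable (column1);

SELECT DISTINCT column1
FROM dbo.theTable;
```

The index still contains one entry per row, so the query reads all of them; the indexed view above stores one row per distinct value, which is why it can be faster at the cost of extra maintenance on writes.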