Let's say we have a table that includes a generated column that concatenates the strings of two columns (data_a + data_b):
id  data_a  data_b  generated_data
1   abc     123     'abc123'
2   xyz     890     'xyz890'
... but we want to change the generation logic, for example reversing the concatenation order (data_b + data_a). Would that "backfill" my previous records, or maintain them and only apply the new logic to new records?
I.e., would this change result in this ("backfill"):
id  data_a  data_b  generated_data
1   abc     123     '123abc'
2   xyz     890     '890xyz'
3   lmn     567     '567lmn'
... or this ("maintain")?
id  data_a  data_b  generated_data
1   abc     123     'abc123'
2   xyz     890     'xyz890'
3   lmn     567     '567lmn'
Generated columns:
A stored generated column is computed when it is written (inserted or updated) and occupies storage as if it were a normal column.
PostgreSQL currently implements only stored generated columns.
If you could change the expression in place, it would not backfill the existing rows; you would have to run an explicit UPDATE to make that happen.
The bigger issue is that I can't see any way to alter the generation expression without dropping the column and adding it back with the new expression. Doing so, though, will change all the column values to the result of the new expression.
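For concreteness, a minimal sketch of the drop-and-re-add approach, assuming PostgreSQL 12+ and a hypothetical table named my_table with text columns data_a and data_b:
-- There is no ALTER that swaps the expression in place, so the
-- generated column is dropped and re-added with the new expression.
ALTER TABLE my_table DROP COLUMN generated_data;
ALTER TABLE my_table
    ADD COLUMN generated_data text
    GENERATED ALWAYS AS (data_b || data_a) STORED;
Because adding a stored generated column computes it for every existing row, this lands you in the "backfill" outcome above.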
Related
I have 2 tables with the exact same number of rows and the same non-repeated id. Because the data comes from 2 sources I want to keep it as 2 tables and not combine them. I assume the best approach would be to leave the unique id as the primary key and join on it?
SELECT * FROM tableA INNER JOIN tableB ON tableA primary key = tableB primary key
The data is used by an application that forces the user to select 1 or many values from 5 drop-downs in cascading order:
select 1 or many values from tableA column1.
select 1 or many values from tableA column2, but filtered by the first selection.
select 1 or many values from tableA column3, but filtered by the second selection, which in turn is filtered by the first selection.
For example:
pk   Column 1  Column 2  Column 3
123  Doe       Jane      2022-01
234  Doe       Jane      2021-12
345  Doe       John      2022-03
456  Jones     Mary      2022-04
Selecting "Doe" from column1 would limit the second filter to ("Jane","John"). And selecting "Jane" from column2 would filter column3 to ("2022-01","2021-12")
And the last part of the question:
The application has 3 selection options for column3:
picking the exact value (for example "2022-01"), picking the year ("2022"), or picking the quarter that the month falls into ("Q1", which equates to months "01","02","03").
What would be the best usage of indexes AND/OR additional columns for this scenario?
Volume of data would be 20-100 million rows.
Each filter is in the range of 5-25 distinct values.
Which version of Postgres are you running?
The volume you state is rather daunting for a use case of populating drop-down boxes from live data in a PG database.
Kidding aside, it's possible; Kibana/Elastic even has a filter widget that works exactly this way, for instance.
My guess is you may want to consider storing the distinct combinations of the search columns in another table, simply to speed up populating the drop-downs. You can achieve that with triggers on the two main tables. So instead of additional columns/indexes you may end up with an additional table ;)
Regarding indexing strategy, and given the hints you stated (AND/OR), I'd say there's no silver bullet. Index the columns that will be queried most often.
Index each column individually; Postgres can combine multiple indexes (via bitmap index scans) to answer conjunctive/disjunctive conditions in WHERE clauses.
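A minimal sketch of both ideas, assuming hypothetical names (main_table, col1..col3) and PostgreSQL 11+ for the EXECUTE FUNCTION syntax; deletes/updates would need similar handling or a periodic rebuild:
-- Lookup table holding only the distinct filter combinations.
CREATE TABLE filter_combos (
    col1 text,
    col2 text,
    col3 text,
    PRIMARY KEY (col1, col2, col3)
);

-- Trigger keeps it in sync on inserts into the main table.
CREATE FUNCTION add_filter_combo() RETURNS trigger AS $$
BEGIN
    INSERT INTO filter_combos (col1, col2, col3)
    VALUES (NEW.col1, NEW.col2, NEW.col3)
    ON CONFLICT DO NOTHING;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_filter_combo
AFTER INSERT ON main_table
FOR EACH ROW EXECUTE FUNCTION add_filter_combo();

-- One index per frequently filtered column on the main table, so the
-- planner can bitmap-AND/OR them as the drop-down selections require.
CREATE INDEX ON main_table (col1);
CREATE INDEX ON main_table (col2);
CREATE INDEX ON main_table (col3);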
Hope this helps
I have 2 billion records in a table in SQL Developer and want to export the records to a CSV file, but while exporting the data I want to sort one column in ascending order. Is there any efficient or quick way to do this?
For example:
Suppose the table name is TEMP and I want to sort the A_KEY column in ascending order and then export it:
TEMP
P_ID  ADDRESS      A_KEY
1     242 Street   4
2     242 Street   5
3     242 Street   3
4     242 Long St  1
Expected result in the CSV file:
P_ID, ADDRESS, A_KEY
4, 242 Long St, 1
3, 242 Street, 3
1, 242 Street, 4
2, 242 Street, 5
I have tried using the below query:
insert into temp2 select * from TEMP order by A_KEY asc;
and then exporting the table from SQL Developer, but is there any efficient or quick way to export the records directly?
Spending time on creating a new table (TEMP2) won't help, because the ORDER BY clause you use during the INSERT means nothing: rows in a table have no guaranteed order. It is the ORDER BY in the SELECT statement you export from that matters.
Therefore, run
select * from temp order by a_key;
and export the result returned by that query.
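For example, a hedged sketch of spooling the export straight from SQLcl (SQL Developer's command-line sibling, which supports SET SQLFORMAT csv; SQL*Plus 12.2+ offers SET MARKUP CSV ON for the same effect), with an arbitrary output file name:
SET SQLFORMAT csv
SPOOL temp_sorted.csv
SELECT * FROM temp ORDER BY a_key;
SPOOL OFF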
2 billion rows? It'll take time. What will you do with such a CSV file? That's difficult to work with.
If you're trying to move data into another Oracle database, then consider using Data Pump export and import utilities which are designed for such a purpose.
I have a requirement where I need to design a table in Postgres where one of the columns needs to have an autoincrement feature. But the autoincrement should be based on the value in another column.
Table
Column A  Column B
100       1
101       1
102       1
102       2
102       3
Column A and Column B are the keys to the table. Now if I insert another row with Column A equal to 100, then Column B needs to auto-populate as 2. If I insert the value 102 into Column A, then Column B needs to populate on its own as 4.
Is there a way I can set an attribute on Column B during table creation?
Thanks
Sadiq
#AdrianKlaver is correct. You should use a timestamp and actually record when the version was saved. If you want the version number, you can generate it with the window function row_number when querying.
select column_a,
       row_number() over (partition by column_a
                          order by column_a, column_a_ver_ts) as column_b
from table_name;
Alternatively, you could use the above query and create a view. See Fiddle.
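As a rough sketch of that design, assuming hypothetical names (table_name, with column_a_ver_ts as the saved-at timestamp):
CREATE TABLE table_name (
    column_a        integer NOT NULL,
    column_a_ver_ts timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (column_a, column_a_ver_ts)
);

-- Derive the per-column_a version number on demand instead of storing it.
CREATE VIEW table_name_versions AS
SELECT column_a,
       row_number() OVER (PARTITION BY column_a
                          ORDER BY column_a_ver_ts) AS column_b
FROM table_name;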
I have the following table
type  attribute  order
1     11         1
1     12         2
2     11         1
2     12         2
3     15         1
3     16         2
4     15         1
4     16         2
I need to understand which types have identical attributes and then assign them a new id. The order column can be used as well if it's helpful, because each attribute can only have one order, but you don't need to use it.
Ideally the result set would be the following where you have a new id for each type that is based on the attributes in the first table.
type  new_id
1     1
2     1
3     2
4     2
I was planning on trying to pivot the table based on the order column and concatenating the attribute id's to create a new id, but I cannot use crosstab and the number of attributes a type has could vary and I need to account for that.
Any suggestions on what to do here?
This works; there's possibly a better way to do it, but it's what came to mind:
SELECT UNNEST(types) AS type, new_id
FROM (
    SELECT ARRAY_AGG(type) AS types, ROW_NUMBER() OVER () AS new_id
    FROM (
        SELECT type, ARRAY_AGG(attribute ORDER BY attribute) AS attr
        FROM t
        GROUP BY type
    ) x
    GROUP BY attr
) y
Output:
1;1
2;1
3;2
4;2
So first it gets the list of attributes for each type, then it gets the list of types for each common list of attributes (this is where it makes sure each type shares the same attributes) and gets a new id for each group of types. Then unnest that to put each type on a new row, and that row number is the new id.
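For reference, a minimal setup the query above can be run against (assuming PostgreSQL and a table simply named t; "order" is quoted because it is a reserved word):
CREATE TABLE t (
    type      integer,
    attribute integer,
    "order"   integer
);

INSERT INTO t (type, attribute, "order") VALUES
    (1, 11, 1), (1, 12, 2),
    (2, 11, 1), (2, 12, 2),
    (3, 15, 1), (3, 16, 2),
    (4, 15, 1), (4, 16, 2);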
I have two partitioned tables. Table A is my main table and Table B is full of columns that are exact copies of some of the columns in Table A. However, there is one column in Table B that has data I need, because the matching column in Table A is full of nulls.
I would like to get rid of Table B completely, since most of it is redundant, and update the matching column in Table A with the data from the one column in Table B.
Visually,
Table A:              Table B:
a  b     c   d        a  b    d
__________________    ______________
1  null  11  A        1  joe  A
2  null  22  B        2  bob  B
3  null  33  C        3  sal  C
I want to fill the b column in Table A with the values from the b column in Table B, and then I no longer need Table B and can delete it. I will have to do this repeatedly since these two tables are given to me daily from two separate sources.
I cannot key these tables, since they are both partitioned.
I have tried:
update columnb:(exec columnb from TableB) from TableA;
but I get a 'length error.
Suggestions on how to approach this in any manner are appreciated.
To replace a column in memory you would do the following.
t1:([]a:1 2 3;b:0N)
a b
---
1
2
3
t2:([]c:`aa`bb`cc;b:5 6 7)
c b
----
aa 5
bb 6
cc 7
t1,'t2
a b c
------
1 5 aa
2 6 bb
3 7 cc
If you are getting length errors then the tables do not have the same row count, and the following would solve it. The obvious problem with this solution is that it will start to repeat data if t2 has a lower row count than t1. You will have to find out why that is.
t1,'count[t1]#t2
Now for partitions, you will use the amend function to change the b column of the partitioned table, tableA, at date 2007.02.23 (or whatever date your partition is).
This loads the b column of tableB into memory to perform the amend. You must perform the amend for each partition.
#[`:2007.02.23/tableA/;`b;:;count[tableA]#exec b from select b from tableB where date=2007.02.23]