PostgreSQL: replace strings in a table using values from another table

I have a problem with two tables used for accounting.
The first table, named tabela1, has a set of accounts containing symbols. The second table holds the symbol, the text it should be replaced with in the first table, and the ID of the matching record in the first table.
Tabela1 is:
ID | KNT_S_WN | KNT_S_MA
1  | 3021-_R  | 3021-_K-_W
2  | _W-_R    | _Z
Tabelas is:
ID | SYMBOL | REP
1  | _R     | 7Z45
1  | _K     | 321-05
1  | _W     | 490
2  | _W     | C1
2  | _R     | C17
2  | _Z     | 320
I need this output:
ID | KNT_S_WN  | KNT_S_MA
1  | 3021-7Z45 | 3021-321-05-490
2  | C1-C17    | 320
I tried this:
update tabela set
knt_s_wn = replace(knt_s_wn,
    (select symbol from tabelas where tabela.id = tabelas.id and position(tabelas.symbol in knt_s_wn) > 0),
    (select rep from tabelas where tabela.id = tabelas.id and position(tabelas.symbol in knt_s_wn) > 0))
With this expression, whenever no symbol is found in knt_s_wn the subqueries return NULL and the whole value is wiped out (blank).
Please help me!!!

One of the simplest solutions is to replace the strings in a loop inside a plpgsql function:
create or replace function multireplace(aid int, str text)
returns text language plpgsql as $$
declare
    rec record;
begin
    for rec in
        select *
        from tabelas
        where id = aid
    loop
        str := replace(str, rec.symbol, rec.rep);
    end loop;
    return str;
end $$;
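If it helps to see the mechanics outside the database, here is a plain-Python sketch of the same loop. The TABELAS dictionary is a hypothetical stand-in for the tabelas table:

```python
# Hypothetical stand-in for the tabelas table: id -> list of (symbol, rep).
TABELAS = {
    1: [("_R", "7Z45"), ("_K", "321-05"), ("_W", "490")],
    2: [("_W", "C1"), ("_R", "C17"), ("_Z", "320")],
}

def multireplace(aid, s):
    # Same idea as the plpgsql loop: apply each (symbol, rep) pair in turn.
    for symbol, rep in TABELAS.get(aid, []):
        s = s.replace(symbol, rep)
    return s
```

Note that the order of replacements matters if one symbol's replacement text happens to contain another symbol.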

A pure SQL solution (i.e. without procedural code) that produces the desired output from the question is below:
with recursive t(id, knt_s_wn, knt_s_ma, symbols, reps) as (
    select
        tabela.id,
        knt_s_wn,
        knt_s_ma,
        array_agg(symbol),
        array_agg(rep)
    from tabela
    join tabelas on tabelas.id = tabela.id
    group by 1, 2, 3
    union all
    select
        id,
        replace(knt_s_wn, symbols[1], reps[1]),
        replace(knt_s_ma, symbols[1], reps[1]),
        array_remove(symbols, symbols[1]),
        array_remove(reps, reps[1])
    from t
    where array_length(symbols, 1) > 0
)
select id, knt_s_wn, knt_s_ma
from t
where symbols = array[]::text[];
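The recursive step can be sketched in plain Python (hypothetical function name): each iteration applies one replace using the head of both arrays and then drops that head, terminating when the symbols array is empty:

```python
def peel_replace(value, symbols, reps):
    # Mirror of the recursive CTE step: consume the head of both arrays,
    # apply one replace, and stop once the symbols array is empty.
    symbols, reps = list(symbols), list(reps)
    while symbols:
        value = value.replace(symbols[0], reps[0])
        symbols.pop(0)
        reps.pop(0)
    return value
```

One caveat of the SQL version: array_remove() deletes every occurrence of the value, so the symbols (and reps) per id should be distinct.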

Related

DB2: How to transpose a multidimensional table from rows to columns to find data changes across rows

Problem
So I've got a table with 80+ columns and two rows. What I need to accomplish is checking which columns have changed value between the two rows, and returning a table of the column names that have changed, their initial value from row 1, and their new value from row 2.
Approach so far
My initial idea was to pivot the two rows into two columns (row 1 as column 1, row 2 as column 2), then join a column of column names (likely taken from SYSCAT.COLUMNS) to the table as column 3, at which point I could do a select where column1 != column2, returning the rows with all the data needed. But alas, not long after coming up with this I discovered that DB2 doesn't support PIVOT / UNPIVOT...
Question
So, is there any way to accomplish this in DB2, taking a table with 80+ columns and two rows like so:
| Col A | Col B | Col C | ... | Col Z|
| ----- | ----- | ----- | --- | ---- |
| Val A | Val B | 123 | ... | 01/01/2021 |
| Val C | Val B | 124 | ... | 02/01/2021 |
And returning a table with the columns changed, their initial value, and their new value:
| Initial | New | ColName|
| ----- | ----- | ----- |
| Val A | Val C | Col A |
| 123 | 124 | Col C |
| 01/01/2021 | 02/01/2021 | Col Z |
Also note that the column data types vary, so they will need to be converted to varchar.
The DB2 version is 11.1.
EDIT: Also for reference as per comment request, this is code I attempted to use to achieve this goal:
WITH
INIT AS (SELECT * FROM TABLE WHERE SOMEDATE=(SELECT MIN(SOMEDATE) FROM TABLE)),
LATE AS (SELECT * FROM TABLE WHERE SOMEDATE=(SELECT MAX(SOMEDATE) FROM TABLE)),
COLS AS (SELECT COLNAME FROM SYSCAT.COLUMNS WHERE TABNAME='TABLE' ORDER BY COLNO)
SELECT * FROM (
SELECT
COLNAME AS ATTRIBUTE,
(SELECT COLNAME AS INITIAL FROM INIT),
(SELECT COLNAME AS NEW FROM LATE)
FROM
COLS
WHERE
(INITIAL != NEW) OR (INITIAL IS NULL AND NEW IS NOT NULL) OR (INITIAL IS NOT NULL AND NEW IS NULL));
The only issue with this one is that I couldn't figure out how to use the values from the COLS table as the columns to be selected.
You may easily generate the text of the needed expressions if you don't want to type them manually.
Consider the following example, which prints only the differing column values of 2 rows of the same quite wide table, SYSCAT.TABLES. We use the following query to generate such expressions:
SELECT
'DECODE(I.I, '
|| LISTAGG(COLNO || ', A.' || COLNAME || CASE WHEN TYPENAME NOT LIKE '%CHAR%' AND TYPENAME NOT LIKE '%GRAPHIC' THEN '::VARCHAR(128)' ELSE '' END, ', ')
|| ') AS INITIAL' AS EXPR_INITIAL
, 'DECODE(I.I, '
|| LISTAGG(COLNO || ', B.' || COLNAME || CASE WHEN TYPENAME NOT LIKE '%CHAR%' AND TYPENAME NOT LIKE '%GRAPHIC' THEN '::VARCHAR(128)' ELSE '' END, ', ')
|| ') AS NEW' AS EXPR_NEW
, 'DECODE(I.I, '
|| LISTAGG(COLNO || ', ''' || COLNAME || '''', ', ')
|| ') AS COLNAME' AS EXPR_COLNAME
FROM SYSCAT.COLUMNS C
WHERE TABSCHEMA = 'SYSCAT' AND TABNAME = 'TABLES'
AND TYPENAME NOT LIKE '%LOB';
It doesn't matter how many columns the table contains. We just filter out the columns of *LOB types as an example. If you want them as well, change the ::VARCHAR(128) cast to some ::CLOB(XXX).
Put these 3 generated expressions into the corresponding places in the query below:
WITH MYTAB AS
(
-- We enumerate the rows to reference them later
SELECT ROWNUMBER() OVER () RN_, T.*
FROM SYSCAT.TABLES T
WHERE TABSCHEMA = 'SYSCAT'
FETCH FIRST 2 ROWS ONLY
)
SELECT *
FROM
(
SELECT
-- Place here the result got in the EXPR_INITIAL column
-- , Place here the result got in the EXPR_NEW column
-- , Place here the result got in the EXPR_COLNAME column
FROM MYTAB A, MYTAB B
,
(
SELECT COLNO AS I
FROM SYSCAT.COLUMNS
WHERE TABSCHEMA = 'SYSCAT' AND TABNAME = 'TABLES'
AND TYPENAME NOT LIKE '%LOB'
) I
WHERE A.RN_ = 1 AND B.RN_ = 2
)
WHERE INITIAL IS DISTINCT FROM NEW;
The result I got in my database:
|INITIAL |NEW |COLNAME |
|--------------------------|--------------------------|---------------|
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|ALTER_TIME |
|26 |15 |COLCOUNT |
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|CREATE_TIME |
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|INVALIDATE_TIME|
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|LAST_REGEN_TIME|
|ATTRIBUTES |AUDITPOLICIES |TABNAME |
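Stripped of the SQL generation machinery, the comparison itself is simple. Here is a Python sketch with made-up sample rows keyed by column name (as SYSCAT.COLUMNS would supply them); the function name is illustrative:

```python
# Hypothetical two rows of a wide table, keyed by column name.
row1 = {"Col A": "Val A", "Col B": "Val B", "Col C": "123", "Col Z": "01/01/2021"}
row2 = {"Col A": "Val C", "Col B": "Val B", "Col C": "124", "Col Z": "02/01/2021"}

def changed_columns(initial, new):
    # The same comparison the generated DECODE expressions perform:
    # keep the (initial, new, column name) triple only where values differ.
    return [(initial[c], new[c], c) for c in initial if initial[c] != new[c]]

diffs = changed_columns(row1, row2)
```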

How to create an array type in Postgres using a custom type

I have custom types in Oracle and I need to replicate them in Postgres.
This is the Oracle type creation code:
CREATE OR REPLACE TYPE vr_return_attendee_rp AS OBJECT (
requestid NUMBER,
startdate VARCHAR2(50),
CONSTRUCTOR FUNCTION vr_return_attendee_rp RETURN SELF AS RESULT
);
/
CREATE OR REPLACE TYPE BODY "VR_RETURN_ATTENDEE_RP" AS
CONSTRUCTOR FUNCTION vr_return_attendee_rp RETURN SELF AS RESULT AS
BEGIN
self.requestid := NULL;
self.startdate := NULL;
return;
END;
END;
/
CREATE OR REPLACE TYPE "VR_RETURN_ATTENDEE_ARR" AS
VARRAY(10000) OF vr_return_attendee_rp;
And then I created the type in Postgres like the following:
CREATE TYPE vr_return_attendee_rp AS (
requestid INT,
startdate VARCHAR(50)
);
How can I create the "VR_RETURN_ATTENDEE_ARR" type in Postgres?
PostgreSQL has native support for array types, as well as functions and operators to manipulate them, so no separate array type declaration is needed:
create type vr_return_attendee_rp as (request_id int, startdate varchar(50));
CREATE TYPE
create table use_vr_type(id serial primary key, vrdata vr_return_attendee_rp[]);
CREATE TABLE
insert into use_vr_type (vrdata) values (array['(1, "2020-01-01")', '(2, "2020-02-01")']::vr_return_attendee_rp[]);
INSERT 0 1
select * from use_vr_type;
id | vrdata
----+-----------------------------------------------
1 | {"(1,\" 2020-01-01\")","(2,\" 2020-02-01\")"}
(1 row)
select id, (vrdata[1]).request_id, (vrdata[1]).startdate from use_vr_type;
id | request_id | startdate
----+------------+-------------
1 | 1 | 2020-01-01
(1 row)
select id, unnest(vrdata) from use_vr_type;
id | unnest
----+-------------------
1 | (1," 2020-01-01")
1 | (2," 2020-02-01")
(2 rows)
select id, a.*
from use_vr_type
cross join lateral unnest(vrdata) with ordinality as a(request_id, startdate, rownum);
id | request_id | startdate | rownum
----+------------+-------------+--------
1 | 1 | 2020-01-01 | 1
1 | 2 | 2020-02-01 | 2
(2 rows)
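For intuition, the composite-type array and the unnest(...) with ordinality query map onto ordinary Python structures; the class and variable names below are just illustrative:

```python
from dataclasses import dataclass

# Rough Python analogue of the composite type.
@dataclass
class VrReturnAttendeeRp:
    request_id: int
    startdate: str

# One table row holding an array of the composite type.
vrdata = [VrReturnAttendeeRp(1, "2020-01-01"), VrReturnAttendeeRp(2, "2020-02-01")]

# unnest(...) with ordinality is essentially enumeration over the array:
rows = [(i, v.request_id, v.startdate) for i, v in enumerate(vrdata, start=1)]
```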

How to return comma-separated values of a column without having to loop through the result set

Let's say I have these 2 tables:
+----+---------+ +----+-----------+----------------+
| Id | Country | | Id | CountryId | City |
+----+---------+ +----+-----------+----------------+
| 1 | USA | | 1 | 1 | Washington, DC |
+----+---------+ +----+-----------+----------------+
| 2 | Canada | | 2 | 2 | Ottawa |
+----+---------+ +----+-----------+----------------+
| 3 | 1 | New York |
+----+-----------+----------------+
| 4 | 1 | Baltimore |
+----+-----------+----------------+
I need to produce a result like:
Id | Country | Cities
---+---------+------------------------------------
1  | USA     | Washington, DC, New York, Baltimore
2  | Canada  | Ottawa
So far, I am looping through the left side table result like this:
DECLARE @table TABLE
(
    Id INT IDENTITY(1, 1),
    CountryId INT,
    City VARCHAR(50)
)
DECLARE @tableString TABLE
(
    Id INT IDENTITY(1, 1),
    CountryId INT,
    Cities VARCHAR(100)
)
INSERT INTO @table
SELECT Id, City
FROM tblCountries
DECLARE @city VARCHAR(50)
DECLARE @id INT
DECLARE @count INT
DECLARE @i INT = 1
SELECT @count = COUNT(*) FROM @table
WHILE (@i <= @count)
BEGIN
    SELECT @id = Id, @city = City FROM @table WHERE Id = @i
    IF (EXISTS(SELECT * FROM @tableString WHERE CountryId = @id))
    BEGIN
        UPDATE @tableString SET Cities = Cities + ', ' + @city WHERE Id = @id
    END
    ELSE
    BEGIN
        INSERT INTO @tableString (CountryId, Cities) VALUES (@id, @city)
    END
    SET @i = @i + 1
END
SELECT tc.Id, tc.Country, ts.Cities
FROM tblCountries tc
LEFT JOIN @tableString ts
    ON tc.Id = ts.CountryId
My concern is that with all that looping in T-SQL, it may be a performance killer. Even with few rows, it appears to be slow. Is there a better way to concatenate those strings without having to loop through the data set, as if I were working in C#?
Thanks for helping.
This has been answered many times, but I've got the feeling that some explanation might help you...
... am I missing something? It seems like this is related to XML
The needed functionality, STRING_AGG(), was introduced with SQL Server 2017. The other direction, STRING_SPLIT(), came with v2016.
But many people still use older versions (and will do so for years), so we need workarounds. There were approaches with loops, bad and slow... You might also use recursive CTEs. And - that's the point here! - we can use some abilities of XML to solve this.
Try this out:
DECLARE #xml XML=
N'<root>
<element>text1</element>
<element>text2</element>
<element>text3</element>
</root>';
-- The query below will read the first <element> below <root> and return text1.
SELECT #xml.value(N'(/root/element)[1]','nvarchar(max)');
--But now try this:
SELECT #xml.value(N'(/root)[1]','nvarchar(max)')
The result is text1text2text3.
The reason for this: if you call .value() on an element without a detailed specification of what you want to read, you get the entire text content of the element back.
Now imagine an XML like this
DECLARE #xml2 XML=
N'<root>
<element>, text1</element>
<element>, text2</element>
<element>, text3</element>
</root>';
With the same query as above you'd get , text1, text2, text3. The only thing left is to cut off the leading comma and the space. This is done - in most examples - with STUFF().
So the challenge is to create this XML. And this is what you find in the linked examples.
A general example is this: Read all tables and list their columns as a CSV-list:
SELECT TOP 10
TABLE_NAME
,STUFF(
(SELECT ',' + c.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_SCHEMA=t.TABLE_SCHEMA AND c.TABLE_NAME=t.TABLE_NAME
ORDER BY c.COLUMN_NAME
FOR XML PATH('')
),1,1,'') AS AllTableColumns
FROM INFORMATION_SCHEMA.TABLES t
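The end result of the FOR XML PATH + STUFF() trick (or of STRING_AGG() on newer versions) is plain grouped concatenation. A quick Python sketch with made-up table/column pairs shows what is being computed:

```python
from collections import defaultdict

# Illustrative (table, column) rows, as INFORMATION_SCHEMA.COLUMNS would list them.
pairs = [("tblOrders", "Id"), ("tblOrders", "Total"), ("tblUsers", "Id")]

def group_concat(rows):
    # One comma-separated string per group: what STRING_AGG / FOR XML PATH emulate.
    grouped = defaultdict(list)
    for table, column in rows:
        grouped[table].append(column)
    # sorted() mirrors the ORDER BY c.COLUMN_NAME inside the subquery.
    return {t: ",".join(sorted(cols)) for t, cols in grouped.items()}
```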

Redshift. Convert comma delimited values into rows with all combinations

I have:
user_id|user_name|user_action
-----------------------------
1 | Shone | start,stop,cancell
I would like to see:
user_id|user_name|parsed_action
-------------------------------
1 | Shone | start
1 | Shone | start,stop
1 | Shone | start,cancell
1 | Shone | start,stop,cancell
1 | Shone | stop
1 | Shone | stop,cancell
1 | Shone | cancell
....
You can create the following Python UDF:
create or replace function get_unique_combinations(input_list varchar(max))
returns varchar(max)
stable as $$
    from itertools import combinations
    arr = input_list.split(',')
    response = []
    for L in range(1, len(arr) + 1):
        for subset in combinations(arr, L):
            response.append(','.join(subset))
    return ';'.join(response)
$$ language plpythonu;
that will take your list of actions and return unique combinations separated by semicolons (elements within a combination are separated by commas). Then you use a UNION ALL hack to split the values into separate rows, like this:
WITH unique_combinations as (
    SELECT
        user_id
        ,user_name
        ,get_unique_combinations(user_action) as action_combinations
    FROM your_table
)
,unwrap_lists as (
    SELECT
        user_id
        ,user_name
        ,split_part(action_combinations, ';', 1) as parsed_action
    FROM unique_combinations
    UNION ALL
    SELECT
        user_id
        ,user_name
        ,split_part(action_combinations, ';', 2) as parsed_action
    FROM unique_combinations
    -- add as many UNION ALL branches as the maximum possible number of combinations,
    -- increasing the 3rd parameter (the 1-based part index) by 1 each time
)
SELECT *
FROM unwrap_lists
WHERE parsed_action <> ''  -- split_part returns an empty string, not NULL, for a missing part
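To sanity-check what the UDF returns, its body can be run as ordinary Python outside Redshift (same code, with a plain function wrapper):

```python
from itertools import combinations

def get_unique_combinations(actions):
    # Same body as the UDF: every non-empty combination, in generation order.
    arr = actions.split(',')
    response = []
    for length in range(1, len(arr) + 1):
        for subset in combinations(arr, length):
            response.append(','.join(subset))
    return ';'.join(response)

# For 3 actions this yields 2^3 - 1 = 7 combinations.
combos = get_unique_combinations('start,stop,cancell').split(';')
```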

Postgresql Update inside For Loop

I'm fairly new to PostgreSQL, and I'm having issues updating a column of null values in a table using a for loop. The table I'm working on is huge, so for brevity I'll give a smaller example which should get the point across. Take the following table:
+----+---+---+------+
| id | A | B | C    |
+----+---+---+------+
| a  | 1 | 0 | NULL |
| b  | 1 | 1 | NULL |
| c  | 2 | 4 | NULL |
| a  | 3 | 2 | NULL |
| c  | 2 | 3 | NULL |
| d  | 4 | 2 | NULL |
+----+---+---+------+
I want to write a for loop which iterates through all of the rows, does some operation on the values in columns A and B, and then inserts a new value in C.
For example: where id = 'a', set C = A*B; where id = 'd', set C = A + B; etc. This would then give me a table like:
+----+---+---+------+
| id | A | B | C    |
+----+---+---+------+
| a  | 1 | 0 | 0    |
| b  | 1 | 1 | NULL |
| c  | 2 | 4 | NULL |
| a  | 3 | 2 | 6    |
| c  | 2 | 3 | NULL |
| d  | 4 | 2 | 6    |
+----+---+---+------+
So ultimately I'd like to loop through all the rows of the table and update column C according to the value in the "id" column. The function I've written (which isn't giving any errors but also isn't updating anything either) looks like this...
-- DROP FUNCTION some_function();
CREATE OR REPLACE FUNCTION some_function()
RETURNS void AS
$BODY$
DECLARE
--r integer; not too sure if this needs to be declared or not
result int;
BEGIN
FOR r IN select * from 'table_name'
LOOP
select(
case
when id = 'a' THEN B*C
when id = 'd' THEN B+C
end)
into result;
update table set C = result
WHERE id = '';
END LOOP;
RETURN;
END
$BODY$
LANGUAGE plpgsql
I'm sure there's something silly I'm missing, probably around what I'm returning... void in this case. But as I only want to update existing rows, do I need to return anything? There are probably easier ways of doing this than using a loop, but I'd like to get it working using this method.
If anyone could point me in the right direction or point out anything blatantly obvious that I'm doing wrong, I'd much appreciate it.
Thanks in advance.
No need for a loop or a function, this can be done with a single update statement:
update table_name
set c = case
when id = 'a' then a*b
when id = 'd' then a+b
else c -- don't change anything
end;
SQLFiddle: http://sqlfiddle.com/#!15/b65cb/2
The reason your function isn't doing anything is this:
update table set C = result
WHERE id = '';
You don't have a row with an empty string in the column id. Your function also seems to use the wrong formula: when id = 'a' THEN B*C - I guess that should be then a*b. As C is NULL initially, B*C will also yield NULL. So even if the update in the loop found a row, it would set C to NULL.
You are also retrieving the values incorrectly from the cursor.
If you really, really want to do it inefficiently in a loop, then your function should look something like this (not tested!):
CREATE OR REPLACE FUNCTION some_function()
RETURNS void AS
$BODY$
DECLARE
    result int;
    r record; -- r is a structure that contains an element for each column in the select list
BEGIN
    FOR r IN select * from table_name
    LOOP
        result := null; -- reset, so rows with other ids don't reuse a stale value
        if r.id = 'a' then
            result := r.a * r.b;
        end if;
        if r.id = 'd' then
            result := r.a + r.b;
        end if;
        update table_name
        set C = result
        WHERE id = r.id; -- note the where condition that uses the value from the record variable
    END LOOP;
END
$BODY$
LANGUAGE plpgsql;
But again: if your table is "huge" as you say, the loop is an extremely bad solution. Relational databases are made to deal with "sets" of data. Row-by-row processing is an anti-pattern that will almost always have bad performance.
Or to put it the other way round: doing set-based operations (like my single update example) is always the better choice.
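To see the set-based version in action end to end, here is a small self-contained demo using SQLite from Python (the CASE expression is written identically in PostgreSQL):

```python
import sqlite3

# Reproduce the single-statement update on the question's sample data.
con = sqlite3.connect(":memory:")
con.execute("create table table_name (id text, a int, b int, c int)")
con.executemany("insert into table_name values (?,?,?,?)",
                [("a", 1, 0, None), ("b", 1, 1, None), ("c", 2, 4, None),
                 ("a", 3, 2, None), ("c", 2, 3, None), ("d", 4, 2, None)])

# The single set-based update: no loop, no function.
con.execute("""
    update table_name
    set c = case
        when id = 'a' then a*b
        when id = 'd' then a+b
        else c  -- don't change anything
    end
""")
rows = con.execute("select id, c from table_name order by rowid").fetchall()
```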