I want to select certain elements from an array column. I know you can do it by position, but I want to filter on content. Here's my data:
table_name | column_names
---------------------+---------------------------------------------------------------
attribute_definition | {attribute_type_concept_id}
cohort_definition | {definition_type_concept_id,subject_concept_id}
condition_occurrence | {condition_concept_id,condition_source_concept_id,condition_type_concept_id}
death | {cause_concept_id,cause_source_concept_id,death_impute_concept_id,death_type_concept_id}
device_exposure | {device_concept_id,device_source_concept_id,device_type_concept_id}
drug_exposure | {dose_unit_concept_id,drug_concept_id,drug_source_concept_id,drug_type_concept_id,route_concept_id}
What I would like to say is something like:
SELECT table_name,
array_agg(SELECT colname FROM column_names WHERE colname LIKE '%type%') AS type_cols,
array_agg(SELECT colname FROM column_names WHERE colname NOT LIKE '%type%') AS other_cols
FROM mytable
GROUP BY table_name
And the result I would like would be:
table_name | type_cols | other_cols
----------------------+--------------------------------------------------------------------------------------------------------------
attribute_definition | {attribute_type_concept_id} | {}
cohort_definition | {definition_type_concept_id} | {subject_concept_id}
condition_occurrence | {condition_type_concept_id} | {condition_concept_id,condition_source_concept_id}
death | {death_type_concept_id} | {cause_concept_id,cause_source_concept_id,death_impute_concept_id}
device_exposure | {device_type_concept_id} | {device_concept_id,device_source_concept_id}
drug_exposure | {drug_type_concept_id} | {dose_unit_concept_id,drug_concept_id,drug_source_concept_id,route_concept_id}
So, I want to end up with the same number of rows but different columns. There's gotta be a simple way to do this. Why can't I find it?
unnest is your friend. As in:
SELECT table_name,
array(SELECT colname FROM unnest(column_names) AS colname WHERE colname LIKE '%type%') AS type_cols,
array(SELECT colname FROM unnest(column_names) AS colname WHERE colname NOT LIKE '%type%') AS other_cols
FROM mytable
GROUP BY table_name, column_names
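To make explicit what the two correlated subqueries do for a single row, here is a minimal Python sketch. It is not part of the original answer, and the function name split_by_substring is made up for illustration. It unpacks one array and splits its elements by whether they contain the substring, which is roughly what unnest plus the two LIKE filters amount to:

```python
def split_by_substring(names, needle):
    """Split one array into (elements containing needle, all other elements)."""
    matching = [name for name in names if needle in name]
    others = [name for name in names if needle not in name]
    return matching, others


# Two rows taken from the question's sample data.
rows = {
    "attribute_definition": ["attribute_type_concept_id"],
    "cohort_definition": ["definition_type_concept_id", "subject_concept_id"],
}

result = {table: split_by_substring(cols, "type") for table, cols in rows.items()}
for table, (type_cols, other_cols) in result.items():
    print(table, type_cols, other_cols)
```

Like array(...) in the SQL version, a row with no non-matching elements ends up with an empty list rather than a null.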
Here is Dan Getz's answer again but in a self-contained statement so it's easily runnable without copying my data.
with grps as
(
with numlist as
(
select '1 - 10' as grp, generate_series(1,10) num
union
select '11 - 20', generate_series(11,20) order by 1,2
)
select grp, array_agg(num) as nums
from numlist
group by 1
)
select grp,
(select array_agg(evens) from unnest(nums) as evens where evens % 2 = 0) as evens,
(select array_agg(odds) from unnest(nums) as odds where odds % 2 != 0) as odds
from grps
group by grp, nums;
grp | evens | odds
---------+------------------+------------------
11 - 20 | {12,14,16,18,20} | {11,13,15,17,19}
1 - 10 | {2,4,6,8,10} | {1,3,5,7,9}
I have two tables
The first table contains three text fields (username, email, num); the second has only one column, birth_date, of type DATE, filled with random dates.
I need to merge the tables without a join condition.
For example
first table:
+----------+--------------+-----------+
| username | email | num |
+----------+--------------+-----------+
| 'user1' | 'user1@mail' | '+794949' |
| 'user2' | 'user2@mail' | '+799999' |
+----------+--------------+-----------+
second table:
+--------------+
| birth_date |
+--------------+
| '2001-01-01' |
| '2002-02-02' |
+--------------+
And I need a result like this:
+----------+------------+-------------+--------------+
| username | email | num | birth_date |
+----------+------------+-------------+--------------+
| 'user1' | 'us1@mail' | '+7979797' | '2001-01-01' |
| 'user2' | 'us2@mail' | '+79898998' | '2002-02-02' |
+----------+------------+-------------+--------------+
The result table should have 100 rows too.
I tried different JOINs, but there is no condition to join on here.
Sure there is a join condition, about the simplest there is: join on true, or use a cross join. Either one is the basic way to merge tables without a condition. However, this does not give you what you want, because with 100 rows in each table it generates a result set of 10,000 rows. You can then use limit:
select *
from table1
join table2 on true
order by random()
limit 100;
select *
from table1
cross join table2
order by random()
limit 100;
There is another option, which I think may be closer to what you want. Assign a row number to each row of each table, then join on this assigned value:
select <column list>
from (select *, row_number() over() rn from table1) t1
join (select *, row_number() over() rn from table2) t2
on (t1.rn = t2.rn);
To eliminate the assigned value you must specifically list each column desired in the result. But that is the way it should be done anyway.
See demo here. (The demo uses just 3 rows instead of 100.)
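For readers who want to see what the row_number() join amounts to, here is a small Python sketch under stated assumptions: the sample rows mirror the question, and pair_by_position is a hypothetical helper name, not anything from the answer. Each table gets its rows numbered from 1, and rows with the same number are glued together:

```python
def pair_by_position(left, right):
    """Number the rows of both inputs from 1 and join rows with equal numbers."""
    numbered_left = {rn: row for rn, row in enumerate(left, start=1)}
    numbered_right = {rn: row for rn, row in enumerate(right, start=1)}
    # Inner join on the assigned number: unmatched trailing rows are dropped.
    return [
        numbered_left[rn] + numbered_right[rn]
        for rn in numbered_left
        if rn in numbered_right
    ]


table1 = [("user1", "user1@mail", "+794949"), ("user2", "user2@mail", "+799999")]
table2 = [("2001-01-01",), ("2002-02-02",)]

paired = pair_by_position(table1, table2)
for row in paired:
    print(row)
```

One caveat worth keeping in mind: row_number() over () has no ORDER BY, so the database is free to number the rows in any order. That is fine when the birth dates are random anyway, but add an ORDER BY inside over(...) if the pairing has to be repeatable.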
How to check that no element of an array contains a substring in Postgres?
$ select * from blogs;
id | comments
-------+---------------
1 | {str1,str2,str3}
2 | {_substr_,str2,str3}
What I expected is like this:
> select * from mytable where ANY(comments) not like '%substr%';
id | comments
-------+---------------
1 | {str1,str2,str3}
If I use unnest, I get the unpacked array joined with every record (not what I expected), like this: https://www.db-fiddle.com/f/9997TuKMMzFUUuyr5VJX7a/0
> select * from (select id,unnest(comments) as cmts from t1) tmp where cmts not like '%substr%'
id | cmts
-------+------
1 | str1
1 | str2
1 | str3
2 | str2
2 | str3
If I use array_to_string(array, delimiter) with not like, I can get what I want, as follows:
> select * from (select id,array_to_string(comments, ',') as cmts from blogs) tmp where cmts not like '%substr%';
id | cmts
-------+----------------
1 | str1,str2,str3
But there is a limitation: *substr* can't contain the delimiter:
# select * from (select id,array_to_string(comments, ',') as cmts from blogs) tmp where cmts not like '%str1,str2%';
id | cmts
-------+--------------------
2 | _substr_,str2,str3
Is there a better way to filter out the whole row if any element of comments contains the specified substring?
If you have a unique id in your table, you can do it like this (result here)
with x as (select *,unnest(arrays) as to_text from t1)
select t1.*
from t1,x
where x.to_text ilike '%sutstr%'
and x.id = t1.id
You can try using the unnest function.
select *
from (
select *,unnest(arrays) as val
from mytable
) tt
WHERE val like '%sutstr%'
If you don't want to unpack the arrays, another option is the ARRAY_TO_STRING function with LIKE.
SELECT *
FROM mytable
where ARRAY_TO_STRING(pub_types, ',') LIKE '%sutstr%'
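The question asks for the opposite test (keep a row only when none of its elements contains the substring), so here is a hedged Python sketch of that logic. The function name rows_without_substring is invented for illustration and the data is the blogs sample from the question. In Postgres the same per-row test could presumably be written as not exists (select 1 from unnest(comments) as c where c like '%substr%'), which checks each element on its own and so avoids the delimiter problem of array_to_string:

```python
def rows_without_substring(rows, needle):
    """Keep rows whose array has no element containing needle."""
    return [
        row for row in rows
        if not any(needle in item for item in row["comments"])
    ]


blogs = [
    {"id": 1, "comments": ["str1", "str2", "str3"]},
    {"id": 2, "comments": ["_substr_", "str2", "str3"]},
]

kept = rows_without_substring(blogs, "substr")
print([row["id"] for row in kept])

# A needle that spans two elements never matches, because each element is
# tested separately; both rows are kept.
spanning = rows_without_substring(blogs, "str1,str2")
print([row["id"] for row in spanning])
```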
I am trying the following with Db2:
Problem
So I've got a table with 80+ columns and two rows.
What I need to accomplish is checking which columns have changed value between the two rows, and returning a table of the column names that have changed, their initial value from row 1, and their new value from row 2.
Approach so far
My initial idea was to pivot the two rows into two columns (row 1 as column 1, row 2 as column 2), then join a column of column names (likely taken from syscat.columns) to the table as column 3. At that point I could do a select where column1 != column2, returning the rows with all the data needed. But alas, not long after coming up with this I discovered that DB2 doesn't support pivot / unpivot...
Question
So is there any idea for how to accomplish this in DB2, taking a table with 80+ columns and two rows like so:
| Col A | Col B | Col C | ... | Col Z|
| ----- | ----- | ----- | --- | ---- |
| Val A | Val B | 123 | ... | 01/01/2021 |
| Val C | Val B | 124 | ... | 02/01/2021 |
And returning a table with the columns changed, their initial value, and their new value:
| Initial | New | ColName|
| ----- | ----- | ----- |
| Val A | Val C | Col A |
| 123 | 124 | Col C |
| 01/01/2021 | 02/01/2021 | Col Z |
Also note the column data types also vary, so will need to be converted to varchar
DB2 version is 11.1
EDIT: Also for reference as per comment request, this is code I attempted to use to achieve this goal:
WITH
INIT AS (SELECT * FROM TABLE WHERE SOMEDATE=(SELECT MIN(SOMEDATE) FROM TABLE)),
LATE AS (SELECT * FROM TABLE WHERE SOMEDATE=(SELECT MAX(SOMEDATE) FROM TABLE)),
COLS AS (SELECT COLNAME FROM SYSCAT.COLUMNS WHERE TABNAME='TABLE' ORDER BY COLNO)
SELECT * FROM (
SELECT
COLNAME AS ATTRIBUTE,
(SELECT COLNAME AS INITIAL FROM INIT),
(SELECT COLNAME AS NEW FROM LATE)
FROM
COLS
WHERE
(INITIAL != NEW) OR (INITIAL IS NULL AND NEW IS NOT NULL) OR (INITIAL IS NOT NULL AND NEW IS NULL));
Only issue with this one is that I couldn't figure how to use the values from the COLS table as the columns to be selected
You can easily generate the text of the expressions needed if you don't want to type them manually.
Consider the following example, which prints only the column values that differ between 2 rows of the same, quite wide, table SYSCAT.TABLES. We use the following query to generate the expressions.
SELECT
'DECODE(I.I, '
|| LISTAGG(COLNO || ', A.' || COLNAME || CASE WHEN TYPENAME NOT LIKE '%CHAR%' AND TYPENAME NOT LIKE '%GRAPHIC' THEN '::VARCHAR(128)' ELSE '' END, ', ')
|| ') AS INITIAL' AS EXPR_INITIAL
, 'DECODE(I.I, '
|| LISTAGG(COLNO || ', B.' || COLNAME || CASE WHEN TYPENAME NOT LIKE '%CHAR%' AND TYPENAME NOT LIKE '%GRAPHIC' THEN '::VARCHAR(128)' ELSE '' END, ', ')
|| ') AS NEW' AS EXPR_NEW
, 'DECODE(I.I, '
|| LISTAGG(COLNO || ', ''' || COLNAME || '''', ', ')
|| ') AS COLNAME' AS EXPR_COLNAME
FROM SYSCAT.COLUMNS C
WHERE TABSCHEMA = 'SYSCAT' AND TABNAME = 'TABLES'
AND TYPENAME NOT LIKE '%LOB';
It doesn't matter how many columns the table contains. We just filter out the columns of *LOB types as an example. If you want them as well, you should change the ::VARCHAR(128) casting to some ::CLOB(XXX).
We put these 3 generated expressions into the corresponding places in the query below:
WITH MYTAB AS
(
-- We enumerate the rows to reference them later
SELECT ROWNUMBER() OVER () RN_, T.*
FROM SYSCAT.TABLES T
WHERE TABSCHEMA = 'SYSCAT'
FETCH FIRST 2 ROWS ONLY
)
SELECT *
FROM
(
SELECT
-- Place here the result got in the EXPR_INITIAL column
-- , Place here the result got in the EXPR_NEW column
-- , Place here the result got in the EXPR_COLNAME column
FROM MYTAB A, MYTAB B
,
(
SELECT COLNO AS I
FROM SYSCAT.COLUMNS
WHERE TABSCHEMA = 'SYSCAT' AND TABNAME = 'TABLES'
AND TYPENAME NOT LIKE '%LOB'
) I
WHERE A.RN_ = 1 AND B.RN_ = 2
)
WHERE INITIAL IS DISTINCT FROM NEW;
The result I got in my database:
|INITIAL |NEW |COLNAME |
|--------------------------|--------------------------|---------------|
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|ALTER_TIME |
|26 |15 |COLCOUNT |
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|CREATE_TIME |
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|INVALIDATE_TIME|
|2019-06-04-22.44.14.493001|2019-06-04-22.44.14.502001|LAST_REGEN_TIME|
|ATTRIBUTES |AUDITPOLICIES |TABNAME |
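As a cross-check of what the generated DECODE expressions compute, here is a minimal Python sketch of the same comparison. It is only an illustration: changed_columns is a hypothetical name, the sample values come from the question's table, and None stands in for SQL NULL. Each column is compared between the two rows with null-safe inequality (the counterpart of IS DISTINCT FROM), and differing values are converted to text:

```python
def changed_columns(row1, row2):
    """Return (initial, new, column name) for every column whose value differs."""
    def as_text(value):
        # Mirror the cast to VARCHAR while keeping NULL as None.
        return None if value is None else str(value)

    changes = []
    for col in row1:
        initial, new = row1[col], row2[col]
        # In Python, None != "x" is True and None != None is False,
        # which matches IS DISTINCT FROM.
        if initial != new:
            changes.append((as_text(initial), as_text(new), col))
    return changes


row1 = {"Col A": "Val A", "Col B": "Val B", "Col C": 123, "Col Z": "01/01/2021"}
row2 = {"Col A": "Val C", "Col B": "Val B", "Col C": 124, "Col Z": "02/01/2021"}

diff = changed_columns(row1, row2)
for initial, new, colname in diff:
    print(initial, new, colname)
```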
I have:
user_id|user_name|user_action
-----------------------------
1 | Shone | start,stop,cancell
I would like to see:
user_id|user_name|parsed_action
-------------------------------
1 | Shone | start
1 | Shone | start,stop
1 | Shone | start,cancell
1 | Shone | start,stop,cancell
1 | Shone | stop
1 | Shone | stop,cancell
1 | Shone | cancell
....
You can create the following Python UDF:
create or replace function get_unique_combinations(list varchar(max))
returns varchar(max)
stable as $$
    from itertools import combinations
    arr = list.split(',')
    response = []
    for L in range(1, len(arr)+1):
        for subset in combinations(arr, L):
            response.append(','.join(subset))
    return ';'.join(response)
$$ language plpythonu;
that will take your list of actions and return unique combinations separated by semicolon (elements in combinations themselves will be separated by commas). Then you use a UNION hack to split values into separate rows like this:
WITH unique_combinations as (
SELECT
user_id
,user_name
,get_unique_combinations(user_action) as action_combinations
FROM your_table
)
,unwrap_lists as (
SELECT
user_id
,user_name
,split_part(action_combinations,';',1) as parsed_action
FROM unique_combinations
UNION ALL
SELECT
user_id
,user_name
,split_part(action_combinations,';',2) as parsed_action
FROM unique_combinations
-- add as many UNION ALL branches as the maximum number of combinations for a single row, increasing the 3rd parameter (1-based index) by 1 each time
)
SELECT *
FROM unwrap_lists
WHERE parsed_action <> '' -- split_part returns an empty string, not null, past the last part
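To see what the UDF hands back before it is split into rows, the body can be run as plain Python. This is a sketch for local experimentation only, written for Python 3 with the argument renamed to actions so it does not shadow the built-in list. It also shows how many UNION ALL branches are needed: a list of n actions yields 2^n - 1 combinations.

```python
from itertools import combinations


def get_unique_combinations(actions):
    """Return all non-empty combinations, ';' between combinations, ',' inside."""
    arr = actions.split(',')
    response = []
    for size in range(1, len(arr) + 1):
        for subset in combinations(arr, size):
            response.append(','.join(subset))
    return ';'.join(response)


result = get_unique_combinations("start,stop,cancell")
parts = result.split(';')
print(result)
print(len(parts))  # 2**3 - 1 = 7 combinations for 3 actions
```

Note that combinations come out grouped by size (all single actions first, then pairs, then the triple), so the row order differs from the listing in the question.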
I need to get ID by joining columns of tables with variable length.
Table A has 2 columns ID and PostCode
-----------------
| ID | PostCode |
|----|----------|
| 1 | BR |
|----|----------|
| 2 | WT |
|----|----------|
| 3 | B71 |
|----|----------|
| 4 | BR5 |
|----|----------|
Table B has columns with Name and Full postcode
|------|----------|
| Name | PostCode |
|------|----------|
| Mr X | CR2 5ER |
|------|----------|
| Ms Y | BT2 6ER |
|------|----------|
| XX | B71 4WQ |
|------|----------|
| YY | BR4 8ER |
|------|----------|
| SS | BR5A 5RT |
|------|----------|
I need to get Id's 1 [BR->BR4 8ER], 3 [B71->B71 4WQ] and 4 [BR5->BR5A 5RT]
How do I get this to work?
select A.PostCode, B.PostCode as FullPostCode, B.Name
from A
join B
on substring(B.PostCode,1,len(A.PostCode)) = A.PostCode
Consider the postcode BR29 8LN. If table A has codes B and BR, this postcode will be captured TWICE - not what the OP would want, and not what I wanted.
The query below captures everything as long as the postcode prefix is immediately followed by a digit, which is what delimits the postcode area:
select A.PostCode, B.PostCode as FullPostCode, B.Name
from B
inner join A
on substring(B.PostCode ,0,len(A.PostCode)+1) = A.PostCode
WHERE IsNumeric(substring(B.PostCode ,len(A.PostCode)+1,1)) = 1
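The effect of the extra digit check is easy to try out in a small Python sketch. The name matching_areas is invented for illustration and the postcodes are the ones discussed in this thread. A prefix only counts as a match when the character right after it is a digit:

```python
def matching_areas(full_postcode, areas):
    """Return the area prefixes that are followed by a digit in full_postcode."""
    matches = []
    for area in areas:
        next_char = full_postcode[len(area):len(area) + 1]
        if full_postcode.startswith(area) and next_char.isdigit():
            matches.append(area)
    return matches


# 'B' is rejected because the next character is 'R', so only 'BR' matches.
print(matching_areas("BR29 8LN", ["B", "BR"]))

# Plain prefix matching, for comparison, captures the postcode twice.
print([area for area in ["B", "BR"] if "BR29 8LN".startswith(area)])

# Side effect: 'BR5A 5RT' matches 'BR' but not 'BR5', since 'A' is not a digit.
print(matching_areas("BR5A 5RT", ["BR", "BR5"]))
```

The last case is worth noticing: under this rule BR5A 5RT is assigned to BR rather than BR5, which differs from the result the question asks for.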
This may help.
DECLARE @TableA TABLE (UserID INT,
PostCode VARCHAR(10))
DECLARE @TableB TABLE (Name VARCHAR(10),
PostCode VARCHAR(10))
INSERT INTO @TableA
VALUES
('1', 'BR'),
('2', 'WT'),
('3', 'B71'),
('4', 'BR5')
INSERT INTO @TableB
VALUES
('Mr X', 'CR2 5ER'),
('Ms Y', 'BT2 6ER'),
('XX', 'B71 4WQ'),
('YY', 'BR4 8ER'),
('SS', 'BR5A 5RT');
WITH CTE
AS (
SELECT CAST(UserID AS VARCHAR(10)) AS UserID,
Name,
tb.PostCode,
ta.PostCode AS PostCode2
,
ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY tb.PostCode DESC) AS PcID
FROM @TableA AS ta
JOIN @TableB AS tb
ON ta.PostCode = LEFT(tb.PostCode, LEN(ta.PostCode))
)
, cte2
AS (
SELECT STUFF((SELECT ', ' + c2.UserID + ' [' + c2.PostCode2 + '-' + c2.PostCode + ']'
FROM cte AS c2
WHERE c1.UserID = c2.UserID
AND PcID = 1
FOR XML PATH('')), 1, 2, '') AS PostCodeMatch
FROM cte AS c1
WHERE PcID = 1
)
SELECT DISTINCT STUFF((SELECT ', ' + PostCodeMatch
FROM cte2 AS c2
FOR XML PATH('')), 1, 2, '') AS PostCodeMatch
FROM cte2
You might do something like this:
select A.PostCode, B.PostCode as FullPostCode, B.Name
from A
join B on B.PostCode like A.PostCode + '%'