How to fuse values from 2 separate columns into a common column in PostgreSQL? - postgresql

Is there a simple way to fuse values from two separate (albeit similar) columns in PostgreSQL?
For example, the following statement:
SELECT a, b FROM stuff;
would currently result in:
a b
-----------
1 2
1 3
1 4
However, I'd like to have the two columns fused in the following way:
ab
---
1
1
1
2
3
4

If you need both columns from the same complex query without losing performance, try something like:
WITH source AS
(SELECT A,B
FROM your_complex_query)
SELECT A as AB
FROM source
UNION ALL
SELECT B as AB
FROM source

select a as ab from stuff
union all
select b from stuff
order by 1
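To try the UNION ALL approach without a PostgreSQL server, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (an assumption made only for the demo; the query text is the same one shown above, and the sample rows come from the question).

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (assumption for the demo only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO stuff VALUES (?, ?)", [(1, 2), (1, 3), (1, 4)])

# Stack column a on top of column b. UNION ALL keeps duplicates,
# and ORDER BY 1 sorts the combined result by its only column.
rows = conn.execute(
    "SELECT a AS ab FROM stuff UNION ALL SELECT b FROM stuff ORDER BY 1"
).fetchall()
fused = [row[0] for row in rows]
print(fused)
```

Note that a plain UNION (without ALL) would remove the repeated 1s, which is not what the question asks for.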

Related

Inserting multiple rows from an array of arrays in postgreSQL where arrayA[0] =>arrayB[arr1[]], arrayA[1]=>arrayB[arr2[]]

I have this scenario I wish to implement using the query directly in postgresql.
I have user inputs in an array and the size of these arrays can vary. A sample user input is given below:
arrayA=[1,2,3];
arrayB=[[11,22,33],[12,23,34],[4,5,6]];
In the table I need to insert the above data in the following manner
Table A
id dataA dataB
1 1 11
2 1 22
3 1 33
4 2 12
5 2 23
6 2 34
7 3 4
8 3 5
9 3 6
I have tried using unnest() but I'm not able to get the output I want. I'm not sure how to use it to get the required output or if it is the right way to use it. Can someone please help me with this!!
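This is a bit tricky due to the two-dimensional array, but the following works with your two sample arrays. The slice [a.idx:a.idx][1:] picks out the inner array at position a.idx, which is then unnested in order: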
select row_number() over (order by a.idx, b.idx) as id,
a.data_a,
b.data_b
from unnest(array[1,2,3]) with ordinality as a(data_a, idx)
cross join lateral unnest( (array[ [11,22,33], [12,23,34], [4,5,6]])[a.idx:a.idx][1:]) with ordinality as b(data_b,idx);
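To make the intended pairing explicit, here is a small Python sketch of the same expansion done on the client side. It is not a replacement for the SQL above; the function name expand is hypothetical, and the sample arrays are the ones from the question.

```python
# Sample input from the question.
array_a = [1, 2, 3]
array_b = [[11, 22, 33], [12, 23, 34], [4, 5, 6]]


def expand(outer, nested):
    """Pair each element of `outer` with every element of the inner
    list at the same position, numbering the rows from 1."""
    rows = []
    next_id = 1
    for data_a, inner in zip(outer, nested):
        for data_b in inner:
            rows.append((next_id, data_a, data_b))
            next_id += 1
    return rows


rows = expand(array_a, array_b)
for row in rows:
    print(row)
```

The resulting tuples have the shape (id, dataA, dataB) and could be passed to a parameterized multi-row INSERT if you prefer to flatten the arrays before they reach the database.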

Regex left join in KDB

In KDB, is it possible to perform a lj (left join) using "like" or "~" to join two tables where one table's key matches another table's key by regex?
Not with the out-of-the-box joins, but you could do something like this (it won't be very efficient):
q)t:([]sym:`ACF`ABC`ABD`BA`AAF`AABG`CDE;col1:til 7)
q)t2:([]regex:("*AB*";"AA?";"A*";"C*");col2:4#.Q.A)
q)t,'t2 first each where each t[`sym]like'\:t2[`regex]
sym col1 regex col2
---------------------
ACF 0 "A*" C
ABC 1 "*AB*" A
ABD 2 "*AB*" A
BA 3 ""
AAF 4 "AA?" B
AABG 5 "*AB*" A
CDE 6 "C*" D
This approach would take the first matched pattern if there's more than one match.
Another idea is to create a manufactured key and left join on the manufactured key.
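For readers who do not have q at hand, the first-match logic above can be sketched in Python. This is only an illustration: fnmatchcase is assumed to be a close enough stand-in for q's like on these particular patterns (both treat * and ? as wildcards), and the function name pattern_left_join is hypothetical.

```python
from fnmatch import fnmatchcase

# Same sample data as the q tables t and t2 above.
syms = ["ACF", "ABC", "ABD", "BA", "AAF", "AABG", "CDE"]
t = [{"sym": s, "col1": i} for i, s in enumerate(syms)]
t2 = [{"regex": p, "col2": c} for p, c in zip(["*AB*", "AA?", "A*", "C*"], "ABCD")]


def pattern_left_join(left, right):
    """For each left row, attach the first right row whose pattern
    matches the sym; unmatched rows get empty values."""
    joined = []
    for row in left:
        match = next(
            (r for r in right if fnmatchcase(row["sym"], r["regex"])), None
        )
        extra = match if match is not None else {"regex": "", "col2": ""}
        joined.append({**row, **extra})
    return joined


result = pattern_left_join(t, t2)
for row in result:
    print(row)
```

As in the q version, the order of the patterns decides which one wins when several match, and each left row is tested against every pattern, so the cost grows with the product of the two table sizes.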

wrong results with analytic functions in hive

I am trying to use analytic functions with a partitioning clause in Hive, but I am getting wrong results.
For example, the data is as follows:
col1 col2
a 1
a 2
a 3
d 1
d 2
e 1
e 2
dense_rank() over(partition by col1,col2)
is giving 1 as the result for all rows.
Do we have to enable analytic functions with some set options?
Does the underlying table need to be partitioned?
I am using Hive on CDH 5.
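One possible reading, offered as an assumption because the question does not show the expected output: partitioning by both col1 and col2 puts every distinct pair in its own partition, so a rank of 1 on every row is what standard SQL window semantics produce, independent of any Hive settings. The sketch below reproduces this with Python's sqlite3 module as a stand-in for Hive (it needs SQLite 3.25 or later for window functions), and contrasts it with partitioning by col1 and ordering by col2.

```python
import sqlite3

# SQLite stands in for Hive here (assumption for the demo only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 2), ("a", 3), ("d", 1), ("d", 2), ("e", 1), ("e", 2)],
)

# Every distinct (col1, col2) pair forms its own partition, so each row ranks 1.
by_both = conn.execute(
    "SELECT col1, col2, dense_rank() OVER (PARTITION BY col1, col2) "
    "FROM t ORDER BY col1, col2"
).fetchall()

# Partition by col1 only and order by col2 to rank rows within each group.
per_group = conn.execute(
    "SELECT col1, col2, dense_rank() OVER (PARTITION BY col1 ORDER BY col2) "
    "FROM t ORDER BY col1, col2"
).fetchall()

print([r[2] for r in by_both])
print([r[2] for r in per_group])
```

If the goal was a rank within each col1 group, the second form is the one to try in Hive as well.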

SQL: How to prevent double summing

I'm not exactly sure what the term is for this, but when you join two tables that have a many-to-many relationship and sum one of the columns, I believe the same values can be summed over and over again.
What I want to accomplish is to prevent this from happening. How do I make sure that my sum function is returning the correct number?
I'm using PostgreSQL
Example:
Table 1                  Table 2
SampleID  DummyName      SampleID  DummyItem
1         John           1         5
1         John           1         4
2         Doe            1         5
3         Jake           2         3
3         Jake           2         3
                         3         2
If I join these two tables ON SampleID, and I want to sum the DummyItem for each DummyName, how can I do this without double summing?
The solution is to first aggregate and then do the join:
select t1.sampleid, t1.dummyname, t.total_items
from table_1 t1
join (
select t2.sampleid, sum(dummyitem) as total_items
from table_2 t2
group by t2.sampleid
) t ON t.sampleid = t1.sampleid;
The real question, however, is: why are there duplicates in table_1?
I would take a step back and try to assess the database design. Specifically, what rules allow such duplicate data?
To address your specific issue given your data, here's one option: create a temp table that contains unique rows from Table 1, then join the temp table with Table 2 to get the sums I think you are expecting.
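The difference between joining first and aggregating first can be checked with a short sketch. It uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption for the demo), a DISTINCT subquery in place of the temp table, and the sample rows as I read them from the question's two tables.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (assumption for the demo only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (sampleid INTEGER, dummyname TEXT)")
conn.execute("CREATE TABLE table_2 (sampleid INTEGER, dummyitem INTEGER)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?)",
    [(1, "John"), (1, "John"), (2, "Doe"), (3, "Jake"), (3, "Jake")],
)
conn.executemany(
    "INSERT INTO table_2 VALUES (?, ?)",
    [(1, 5), (1, 4), (1, 5), (2, 3), (2, 3), (3, 2)],
)

# Naive version: join first, then sum. Duplicate rows in table_1
# multiply the matching table_2 rows, inflating the totals.
naive = conn.execute("""
    SELECT t1.dummyname, SUM(t2.dummyitem)
    FROM table_1 t1
    JOIN table_2 t2 ON t2.sampleid = t1.sampleid
    GROUP BY t1.sampleid, t1.dummyname
    ORDER BY t1.sampleid
""").fetchall()

# Aggregate table_2 first and join it to the distinct rows of table_1.
fixed = conn.execute("""
    SELECT d.dummyname, t.total_items
    FROM (SELECT DISTINCT sampleid, dummyname FROM table_1) d
    JOIN (
        SELECT sampleid, SUM(dummyitem) AS total_items
        FROM table_2
        GROUP BY sampleid
    ) t ON t.sampleid = d.sampleid
    ORDER BY d.sampleid
""").fetchall()

print(naive)
print(fixed)
```

John and Jake each appear twice in table_1, so the naive totals for them are exactly double the aggregated ones.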

kdb Update entire column with data from another table

I have two partitioned tables. Table A is my main table and Table B is full of columns that are exact copies of some of the columns in Table A. However, there is one column in Table B that has data I need, because the matching column in Table A is full of nulls.
I would like to get rid of Table B completely, since most of it is redundant, and update the matching column in Table A with the data from the one column in Table B.
Visually,
Table A:                 Table B:
a  b     c   d           a  b    d
__________________       ______________
1  null  11  A           1  joe  A
2  null  22  B           2  bob  B
3  null  33  C           3  sal  C
I want to fill the b column in Table A with the values from the b column in Table B, and then I no longer need Table B and can delete it. I will have to do this repeatedly since these two tables are given to me daily from two separate sources.
I cannot key these tables, since they are both partitioned.
I have tried:
update columnb:(exec columnb from TableB) from TableA;
but I get a 'length error.
Suggestions on how to approach this in any manner are appreciated.
To replace a column in memory you would do the following.
t1:([]a:1 2 3;b:0N)
a b
---
1
2
3
t2:([]c:`aa`bb`cc;b:5 6 7)
c b
----
aa 5
bb 6
cc 7
t1,'t2
a b c
------
1 5 aa
2 6 bb
3 7 cc
If you are getting length errors, then the tables do not have
the same row count, and the following would work around it. The obvious
problem with this solution is that it will start to repeat
data if t2 has fewer rows than t1. You will have to find out why that is.
t1,'count[t1]#t2
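The repetition mentioned above comes from how take (#) behaves when asked for more items than a list holds: it wraps around to the start. A minimal Python sketch of that behaviour follows; the helper name take and the shortened b column are hypothetical and only meant to mimic q's n#list for non-negative n.

```python
from itertools import cycle, islice


def take(n, items):
    """Mimic q's n#list for non-negative n: return the first n items,
    wrapping around to the start when the list is shorter than n."""
    return list(islice(cycle(items), n))


t1_a = [1, 2, 3]   # column a of t1
t2_b = [5, 6]      # a b column that is one row short (hypothetical)

# Pad t2's column up to t1's length, then pair the columns row by row.
padded_b = take(len(t1_a), t2_b)
joined = list(zip(t1_a, padded_b))
print(joined)
```

The third row silently reuses the first value of t2's column, which is why the mismatch in row counts should be investigated rather than papered over.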
Now for partitions, you will use the amend function to change
the b column of the partitioned table, table A, at date 2007.02.23 (or whatever date your partition is).
This loads the b column of tableB into memory to perform the amend. You must perform the amend for each partition.
@[`:2007.02.23/tableA/;`b;:;count[tableA]#exec b from select b from tableB where date=2007.02.23]