Split dynamic array row data into column names in hive - hiveql

Problem statement: How to convert a row value, which has a dynamic array into column names in Hive.
Eg: table name: TAB1
Col1  Col2
1     {a,b,c}
2     {a,b,d,e}
3     {e,c,a,m,n}
Required output: I need to split the dynamic array in each row into column names, using Col1 as the filter on TAB1.
The final query needs to be something like the statements below (I know some sort of JOIN to TAB1 is required):
select 1,a,b,c from TAB2;
select 2,a,b,d,e from TAB2;
select 3,e,c,a,m,n from TAB2;

Try this - Hive cannot pivot into a dynamic set of columns directly, so generate each statement as a string (assuming Col2 is an array<string>) and execute the results:
SELECT CONCAT('SELECT ', CAST(Col1 AS STRING), ',',
              concat_ws(',', Col2),
              ' FROM TAB2;')
FROM TAB1;
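If Col2 is instead stored as a plain string like '{a,b,c}' rather than an array<string>, the braces can be stripped first (a sketch under that storage assumption):
SELECT CONCAT('SELECT ', CAST(Col1 AS STRING), ',',
              -- remove the surrounding braces from the stored string
              regexp_replace(Col2, '[{}]', ''),
              ' FROM TAB2;')
FROM TAB1;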

Related

Postgresql group into predefined groups where group names come from a database table

I have a database table with data similar to this:
create table DataTable (
  name  text,
  value integer
);
insert into DataTable values
('A', 1), ('A', 2), ('B', 3), ('Other', 5), ('C', 1);
And I have another table:
create table "group" (
  name      text,
  "default" boolean
);
insert into "group" values
('A', false), ('B', false), ('Other', true);
I want to group the data in the first table based on the defined groups in the second table.
Expected output
Name | sum
A | 3
B | 3
Other | 6
Right now I'm using this query:
select coalesce(g.name, (select name from "group" where "default" = true)) as name,
       sum(dt.value)
from DataTable dt
left join "group" g on dt.name = g.name
group by 1;
This works, but it can cause performance problems in some situations. Any better way to do this?
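One alternative (a sketch, assuming "group" contains exactly one default row) is to fetch the default name once in a CTE rather than in a scalar subquery inside coalesce():
-- Resolve the default group name a single time,
-- then attach it to every row via a one-row cross join.
with default_group as (
  select name
  from "group"
  where "default" = true
)
select coalesce(g.name, dg.name) as name,
       sum(dt.value) as sum
from DataTable dt
left join "group" g on dt.name = g.name
cross join default_group dg
group by 1;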

How to extract all entries from a JSON array?

Field name: groups
Value:
[{"GroupId": "abcd-41234", "GroupName": "testingrule"}]
How do I extract the GroupId and GroupName values as separate fields using a select statement?
These are my failed attempts:
select groups->>'GroupId' as id,
       groups->>'GroupName' as name
from table_name;
select (groups::json->>'GroupId')::json->>'id' as id
from table_name;
select groups::json->>'GroupId' as id
from table_name;
I assume that you are using the jsonb data type. If not, change your table definition.
If you want the values for all array elements for all rows in the table, you would use a lateral join like this:
SELECT exp.j ->> 'GroupId'   AS groupid,
       exp.j ->> 'GroupName' AS groupname
FROM table_name AS t
CROSS JOIN LATERAL jsonb_array_elements(t.groups) AS exp(j);
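If every row holds exactly one array element, the lateral join can be skipped and the first element addressed directly (a minimal variant under the same jsonb assumption):
-- -> 0 picks the first array element; ->> extracts the value as text
SELECT groups -> 0 ->> 'GroupId'   AS groupid,
       groups -> 0 ->> 'GroupName' AS groupname
FROM table_name;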

Build statistics for a tsvector column

I want to build a table where each row contains a string and the number of rows where that string appears as a prefix.
Basically I want
select count(*) from "myTable" where tsfield @@ (p || ':*')::tsquery
for each value of p in an array.
How can I write a query to do this?
Unnest the array and join:
SELECT arr.p, count(*)
FROM "myTable"
JOIN unnest('{...}'::text[]) AS arr(p)
  ON tsfield @@ (arr.p || ':*')::tsquery
GROUP BY arr.p;
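Note that the inner join drops prefixes that match no rows at all. If zero counts should show up too, flip the join around (a sketch with a hypothetical prefix list):
-- count(t.tsfield) counts only matched rows, so unmatched
-- prefixes come out as 0 instead of disappearing.
SELECT arr.p, count(t.tsfield) AS matches
FROM unnest(ARRAY['pre', 'post', 'sub']) AS arr(p)
LEFT JOIN "myTable" t
  ON t.tsfield @@ (arr.p || ':*')::tsquery
GROUP BY arr.p;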

How to aggregate column data from all resulting rows into one column?

I have a case-driven query. Below is the simplest form:
select column_1 from mytable
Results:
column_1
latinnametest
LatinManual
LatinAuto
Is it possible to show the aggregated column_1 data of all resulting rows in another column, say column_2, comma separated, in front of each row?
Expected:
column_1        column_2
latinnametest   latinnametest,LatinManual,LatinAuto
LatinManual     latinnametest,LatinManual,LatinAuto
LatinAuto       latinnametest,LatinManual,LatinAuto
I have used array_agg() and concat(), but they aggregate only the same row's data in column_2, not the data from all rows comma separated as expected. Any help please.
Edit:
I have tried the solution mentioned below, but I am getting repetitive data in the column; hovering the mouse over the last column shows the repeated values. Any solution to this?
You can use string_agg() as a window function:
select column_1,
       string_agg(column_1, ',') over () as all_values
from the_table;
Edit, after the scope was changed:
If you need distinct values, use a derived table:
select column_1,
       string_agg(column_1, ',') over () as all_values
from (
  select distinct column_1
  from the_table
) t;
Alternatively, with a common table expression:
with vals as (
  select string_agg(distinct column_1, ',') as all_values
  from the_table
)
select t.column_1, v.all_values
from the_table t
cross join vals v;
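If the order of the list matters, an ORDER BY can go inside the aggregate in the CTE form (a small variant, assuming alphabetical order is acceptable; PostgreSQL does not allow DISTINCT or an aggregate ORDER BY in the window-function form):
with vals as (
  -- DISTINCT and ORDER BY may be combined here because
  -- the ORDER BY expression matches the DISTINCT expression.
  select string_agg(distinct column_1, ',' order by column_1) as all_values
  from the_table
)
select t.column_1, v.all_values
from the_table t
cross join vals v;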

PostgreSQL group by all fields

I have a query like this:
SELECT
  table1.*,
  sum(table2.amount) as totalamount
FROM table1
JOIN table2 ON table1.key = table2.key
GROUP BY table1.*;
I got the error: column "table1.key" must appear in the GROUP BY clause or be used in an aggregate function.
Is there any way to group by "all" fields?
There is no shortcut syntax for grouping by all columns, but it's probably not necessary in the described case. If the key column is the primary key, it's enough to group by it alone, because PostgreSQL treats the remaining columns of table1 as functionally dependent on it:
GROUP BY table1.key;
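Applied to the original query (assuming key is the primary key of table1):
SELECT
  table1.*,
  sum(table2.amount) AS totalamount
FROM table1
JOIN table2 ON table1.key = table2.key
-- grouping by the primary key alone is enough here
GROUP BY table1.key;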
You have to specify in GROUP BY all the column names that are selected and are not part of an aggregate function (SUM, COUNT, etc.):
select c1, c2, c4, sum(c3) FROM t
group by c1, c2, c4;
A shortcut that avoids writing the columns again in GROUP BY is to refer to them by their position in the select list:
select c1, c2, c4, sum(c3) FROM t
group by 1, 2, 3;
I found another way to solve this; it's not perfect, but it may be useful:
SELECT string_agg(column_name::character varying, ',') as columns
FROM information_schema.columns
WHERE table_schema = 'your_schema'
  AND table_name = 'your_table';
Then apply this select's result to the main query, interpolating it from application code:
$columns = $result[0]["columns"];
SELECT
  table1.*,
  sum(table2.amount) as totalamount
FROM table1
JOIN table2 ON table1.key = table2.key
GROUP BY $columns;
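If any column names need quoting (mixed case, spaces), the list can be built more defensively with quote_ident() (a sketch; 'your_schema' and 'your_table' stay placeholders):
-- quote_ident() double-quotes identifiers only when necessary
SELECT string_agg(quote_ident(column_name::text), ',') AS columns
FROM information_schema.columns
WHERE table_schema = 'your_schema'
  AND table_name = 'your_table';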