Getting jsonb field names from query result - postgresql

I have two tables like this:
create table product (
id serial primary key,
name text
);
create table selectedattribute (
id serial primary key,
product integer references product,
attribute text,
val text
);
and I'm creating a materialized view with this select query:
select product.name,
jsonb_build_object(
'color', COALESCE(jsonb_agg(val) FILTER (WHERE attribute='color'), '[]'),
'diameter', COALESCE(jsonb_agg(val) FILTER (WHERE attribute='diameter'), '[]')
)
from product
left join selectedattribute on product.id = selectedattribute.product
group by product.id;
The problem with this query is that whenever I add a new attribute, I also have to add it to the select query in order to create an up-to-date materialized view.
Is there a way to write an aggregate expression that dynamically gets attributes without all these hard-coded attribute names?
You can try my code in SQL Fiddle: http://sqlfiddle.com/#!17/c4150/4

You need to nest the aggregation. First collect all values for an attribute then aggregate that into a JSON:
select id, name, jsonb_object_agg(attribute, vals)
from (
select p.id, p.name, a.attribute, jsonb_agg(a.val) vals
from product p
left join selectedattribute a on p.id = a.product
group by p.id, a.attribute
) t
group by id, name;
Updated SQLFiddle: http://sqlfiddle.com/#!17/c4150/5
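One caveat worth checking against your data: because of the left join, a product without any attributes yields a row whose attribute is NULL, and jsonb_object_agg() rejects NULL keys. A minimal sketch of a NULL-tolerant variant follows; the output column name attributes and the empty-object default are assumptions, not part of the original answer.

```sql
-- Sketch: same nested aggregation, but products without attributes get '{}'.
select id,
       name,
       coalesce(
         -- skip the NULL attribute row produced by the left join
         jsonb_object_agg(attribute, vals) filter (where attribute is not null),
         '{}'::jsonb
       ) as attributes  -- hypothetical column name
from (
  select p.id, p.name, a.attribute, jsonb_agg(a.val) as vals
  from product p
    left join selectedattribute a on p.id = a.product
  group by p.id, a.attribute
) t
group by id, name;
```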

Related

How do I make my RANK () OVER query work in select?

table image
I have this table, and I need to query it as follows:
rank departments by salary;
show the message 'No data to be shown' if Salary is NULL;
add the total salary paid to each department;
count the people in each department.
SELECT RANK() OVER (
ORDER BY Salary DESC
)
,CASE
WHEN Salary IS NULL
THEN 'NO DATA TO BE SHOWN'
ELSE Salary
,Count(Fname)
,Total(Salary) FROM dbo.Employees
I get an error saying:
Column 'dbo.Employees.Salary' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Why so?
Aggregate functions return a single value for the whole table (or for each group), so you can't select a plain column alongside them; it doesn't make sense. Say you have a students table and you apply SUM(marks) to the whole table, while also selecting studentname in the same query. Which student's name should the database engine return? It is ambiguous.
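To make the students example concrete, here is a small sketch. The table students(studentname, marks) and the alias total_marks are hypothetical names taken from the explanation above, not from the asker's schema.

```sql
-- Hypothetical table: students(studentname, marks)

-- One total for the whole table: no plain column may be selected next to it.
SELECT SUM(marks) AS total_marks
FROM students;

-- One total per student: studentname is allowed because it appears in GROUP BY.
SELECT studentname, SUM(marks) AS total_marks
FROM students
GROUP BY studentname;
```

The same rule applies to Salary in the question: it must either be wrapped in an aggregate or be listed in GROUP BY.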
Column "invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause"
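I tried this, using an inner query:
SELECT RANK() OVER (ORDER BY SAL DESC) AS SALARY_RANK,
DEPARTMENT,
HEADCOUNT,
-- cast SAL so that both CASE branches return text
CASE
WHEN SAL IS NULL THEN 'NO DATA TO BE SHOWN'
ELSE CAST(SAL AS VARCHAR(30))
END AS TOTAL_SALARY
FROM
-- aggregate per department first, then rank the aggregated rows
(SELECT COUNT(FNAME) AS HEADCOUNT, SUM(SALARY) AS SAL, DEPARTMENT
FROM TESTEMPLOYEE
GROUP BY DEPARTMENT) t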

Show records that have only one matching row in another table

I need to write some SQL that is probably very simple, but I am very new to it.
I need to find all the records from one table that have a matching id (but no more than one) in the other table. E.g. one table contains the employee records and the second one contains the employees' telephone numbers. I need to find all employees with only one telephone number.
Sample data would be nice. In the absence of any, something like:
SELECT
employees.employee_id
FROM
employees
LEFT JOIN
(SELECT employee_id FROM emp_phone GROUP BY employee_id HAVING count(*) = 1) AS phone -- keep only employees with exactly one number
ON
employees.employee_id = phone.employee_id
WHERE
phone.employee_id IS NOT NULL;
You need a join of the 2 tables, group by employee and the condition in the having clause:
SELECT e.employee_id, e.name
FROM employees e INNER JOIN numbers n
ON e.employee_id = n.employee_id
GROUP BY e.employee_id, e.name
HAVING COUNT(*) = 1;
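If the single phone number is needed in the result as well, a possible sketch is shown below. The column name tel_number is an assumption about the numbers table. Since HAVING keeps only groups with exactly one row, MIN() simply returns that one number.

```sql
-- Sketch: tel_number is a hypothetical column of the numbers table.
SELECT e.employee_id,
       e.name,
       MIN(n.tel_number) AS tel_number  -- the group's only number
FROM employees e
  INNER JOIN numbers n ON e.employee_id = n.employee_id
GROUP BY e.employee_id, e.name
HAVING COUNT(*) = 1;
```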
If there can be more than a few numbers per employee in the table with the employees' telephone numbers (calling it tel), then it's cheaper to avoid GROUP BY and HAVING, which have to process all rows. Find employees with "unique" numbers using a self-anti-join with NOT EXISTS.
While you don't need more than the employee_id and their unique phone number, you don't even have to involve the employee table at all:
SELECT *
FROM tel t
WHERE NOT EXISTS (
SELECT FROM tel
WHERE employee_id = t.employee_id
AND tel_number <> t.tel_number -- or use PK column
);
If you need additional columns from the employee table:
SELECT * -- or any columns you need
FROM (
SELECT employee_id AS id, tel_number -- or any columns you need
FROM tel t
WHERE NOT EXISTS (
SELECT FROM tel
WHERE employee_id = t.employee_id
AND tel_number <> t.tel_number -- or use PK column
)
) t
JOIN employee e USING (id);
The column alias in the subquery (employee_id AS id) is just for convenience. Then the outer join condition can be USING (id), and the ID column is only included once in the result, even with SELECT * ...
Simpler with a smart naming convention that uses employee_id for the employee ID everywhere. But it's a widespread anti-pattern to use employee.id instead.
Related:
JOIN table if condition is satisfied, else perform no join

How to find the index of a value in an array in postgresql?

I have this table and some sample data as well. I want to get the index of each value in array in separate column.
CREATE TABLE contacts (
id serial PRIMARY KEY,
name VARCHAR (100),
phones TEXT []
);
Sample data.
INSERT INTO contacts (name, phones)
VALUES
(
'John Doe',
'{"(408)-589-5846","(408)-589-5555"}'
),
(
'Lily Bush',
'{"(408)-589-5841"}'
),
(
'William Gate',
'{"(408)-589-5842","(408)-589-58423"}'
);
Now I run this query to unnest the data into rows:
select name, unnest(phones) from contacts
It returns the data correctly, but I also want the index of each phone number in another column, so that I can identify which phone number is at which index.
I came across the array_position() function, but it's not working as expected and throws an error; maybe I'm not using it the right way. I am new to postgresql, so any help would be appreciated.
Use unnest() in the FROM clause; you then get the index using the option with ordinality:
select c.name,
p.phone,
p.idx
from contacts c
cross join lateral unnest(phones) with ordinality as p(phone, idx)
order by c.id, p.idx;
The above would not return rows from the contacts table that have an empty phones array. If you need those, use a LEFT JOIN:
select c.name,
p.phone,
p.idx
from contacts c
left join lateral unnest(phones) with ordinality as p(phone, idx) on true
order by c.id, p.idx;
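As an alternative sketch (not part of the original answer), generate_subscripts() can produce the index first, and the element is then looked up by subscript. This assumes one-dimensional arrays; the aliases i, phone and idx are arbitrary.

```sql
-- Sketch: iterate over the subscripts of dimension 1, then index into the array.
select c.name,
       c.phones[i] as phone,
       i as idx
from contacts c
  cross join lateral generate_subscripts(c.phones, 1) as i
order by c.id, i;
```

Like the cross join version above, this returns no row for a contact whose phones array is empty.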

Add Column in table with value partition by group

My table is something like:
CREATE TABLE table1
(
_id text,
name text,
data_type int,
data_value int,
data_date timestamp -- insertion time
);
Due to a system bug, many duplicate entries were created, and I need to remove the duplicates and keep only unique entries. data_date is excluded from the comparison because it is a system-generated date.
My query to do that is something like:
DELETE FROM table1 A
USING ( SELECT _id, name, data_type, data_value, MIN(data_date) min_date
FROM table1
GROUP BY _id, name, data_type, data_value
HAVING count(data_date) > 1) B
WHERE A._id = B._id
AND A.name = B.name
AND A.data_type = B.data_type
AND A.data_value = B.data_value
AND A.data_date != B.min_date;
Although this query works, the table has millions of records and I want a faster way to do it. My idea is to create a new column whose value is a partition over [_id, name, data_type, data_value], i.e. the columns in the GROUP BY. However, I could not find a way to create such a column.
I would appreciate it if anyone could suggest a way to create such a column.
Edit 1:
There is another thing to add: I don't want to use a CTE or subquery for updating this new column, because that would be the same as my existing query.
The best way is simply creating a new table without duplicated records:
CREATE...
SELECT _id, name, data_type, data_value, MIN(data_date) min_date
FROM table1
GROUP BY _id, name, data_type, data_value;
Alternatively, you can create a rank and then filter, but a subquery is needed.
RANK() OVER (PARTITION BY your_variables ORDER BY data_date ASC) r
And then filter r=1.
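The rank-and-filter idea could look like the sketch below. It swaps RANK() for ROW_NUMBER(), so that two duplicates sharing the same data_date cannot both end up with r = 1. The alias ranked is arbitrary, and the partition columns are taken from the GROUP BY in the question.

```sql
-- Sketch: keep the earliest row of every duplicate group.
SELECT _id, name, data_type, data_value, data_date
FROM (
  SELECT t.*,
         ROW_NUMBER() OVER (
           PARTITION BY _id, name, data_type, data_value
           ORDER BY data_date ASC
         ) AS r
  FROM table1 t
) ranked
WHERE r = 1;
```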

Joined aggregate function update?

I'm trying to denormalize an aggregate bit of data for performance, but can't figure out how to get the aggregation working...
CREATE TABLE brands (
id SERIAL,
name TEXT,
total INTEGER,
unitcount INTEGER
)
CREATE TABLE items (
brandid INTEGER,
id SERIAL,
unitvalue INTEGER
)
UPDATE brands SET b.total = i.sumScore,
b.unitcount = i.unitcount
FROM brands b
INNER JOIN
(
SELECT brandid,
SUM(unitvalue) sumScore,
COUNT(unitvalue) unitcount
FROM items
group by brandid
) as a
ON i.brandid = b.id
This updates EVERY record in brands with the same values, even though the inner join query on its own returns a correct set of distinct values for each brand. How can I get that correlated?
Please try the following (this is MySQL's multi-table UPDATE syntax):
UPDATE brands b
INNER JOIN (
SELECT brandid,SUM(unitvalue) value_to_update,COUNT(unitvalue) cnt
FROM items GROUP BY brandid) abc
ON b.id=abc.brandid
SET b.total=abc.value_to_update,b.unitcount=abc.cnt
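The UPDATE ... FROM form in the question suggests PostgreSQL, although that is an assumption. There the target table must not be repeated in the FROM clause, the columns in SET are written without a table alias, and the correlation goes into WHERE. A minimal sketch under that assumption:

```sql
-- Sketch for PostgreSQL: join the aggregated subquery to the target via WHERE.
UPDATE brands b
SET total = i.sumscore,
    unitcount = i.unitcount
FROM (
  SELECT brandid,
         SUM(unitvalue) AS sumscore,
         COUNT(unitvalue) AS unitcount
  FROM items
  GROUP BY brandid
) AS i
WHERE i.brandid = b.id;
```

Brands without any items are not matched by the join and keep their current values.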