Postgres Crosstab allocating values to wrong columns - postgresql

I have a sample table named antest as shown below to test the crosstab function.
create table antest(student text, subject text, result numeric);
insert into antest(student, subject, result) values
('peter','music',2.0),
('peter','language',2.0),
('gabriel','history',8.0),
('john','history',9.0),
('john','maths',4.0),
('john','music',7.0);
student|subject|result
-------+-------+------
peter |music |2.0
peter |language|2.0
gabriel|history|8.0
john |history|9.0
john |maths |4.0
john |music |7.0
Result wanted:
student|music|language|history|maths
-------+-----+--------+-------+-----
peter |2.0 |2.0 | |
gabriel| | |8.0 |
john |7.0 | |9.0 |4.0
I executed the following query for that:
select *
from public.crosstab (
'select student, subject, result from antest',
'select distinct subject from antest'
) as final_result(student text, music numeric, maths numeric, history numeric, language numeric);
I got the following result:
student|music|maths|history|language
-------+-----+-----+-------+--------
peter |2.0 | | |2.0
gabriel| |8.0 | |
john |7.0 |9.0 |4.0 |
Please let me know what mistake I am making.
Later I will have to repeat this query on another database, which is about 30 GB in size and has about 75 attributes. Is there any way to automate it?

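crosstab fills the output columns by position, not by name, so you have to take care that the order of categories is exactly the same in the category query and in the column definition. A select distinct without order by returns the categories in no guaranteed order, which is why the values ended up in the wrong columns. Because you want an arbitrarily chosen (not alphabetical) order, you should use values in the category query.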
select *
from crosstab (
$q$ select student, subject, result from antest $q$,
$q$ values ('music'), ('language'), ('history'), ('maths') $q$
) as final_result(student text, music numeric, language numeric, history numeric, maths numeric);
student | music | language | history | maths
---------+-------+----------+---------+-------
peter | 2.0 | 2.0 | |
gabriel | | | 8.0 |
john | 7.0 | | 9.0 | 4.0
(3 rows)
Of course, you can use order by in the category query, but then the column definition has to follow the same (alphabetical) order:
select *
from crosstab (
$q$ select student, subject, result from antest $q$,
$q$ select distinct subject from antest order by 1 $q$
) as final_result(student text, history numeric, language numeric, math numeric, music numeric);
student | history | language | math | music
---------+---------+----------+------+-------
peter | | 2.0 | | 2.0
gabriel | 8.0 | | |
john | 9.0 | | 4.0 | 7.0
(3 rows)
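On the automation question: one possible approach is to generate the crosstab statement from a single ordered list of categories, so that the category query and the column definition cannot drift apart. The following is only a sketch. The helper names (quote_ident, quote_literal, build_crosstab_query) are hypothetical, it assumes the table and column names are trusted, and it assumes no category contains the text $q$. It also adds order by 1 to the source query, which keeps rows for the same student together but returns students in alphabetical order rather than the order shown above.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote an SQL string literal, escaping embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_crosstab_query(table, row_col, cat_col, val_col, categories,
                         row_type="text", val_type="numeric"):
    """Build a crosstab statement whose category list and column
    definition are derived from the same ordered list."""
    # Source query: row name, category, value (table/column names assumed trusted).
    source_sql = f"select {row_col}, {cat_col}, {val_col} from {table} order by 1"
    # Category query: a VALUES list in exactly the order of `categories`.
    category_sql = "values " + ", ".join(f"({quote_literal(c)})" for c in categories)
    # Column definition: row column first, then one column per category, same order.
    column_defs = ", ".join(
        [f"{quote_ident(row_col)} {row_type}"]
        + [f"{quote_ident(c)} {val_type}" for c in categories]
    )
    return (
        "select *\n"
        "from crosstab (\n"
        f"  $q$ {source_sql} $q$,\n"
        f"  $q$ {category_sql} $q$\n"
        f") as final_result({column_defs});"
    )


query = build_crosstab_query(
    "antest", "student", "subject", "result",
    ["music", "language", "history", "maths"],
)
print(query)
```

For a table with about 75 attributes, the categories list could first be fetched with a query such as select distinct subject from antest order by 1 and then passed to the helper unchanged.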

Related

How to use array values from string_to_array to JOIN to a look-up table

I have an animals table like this:
Column | Type | Collation | Nullable | Default
------------------+--------------------------+-----------+----------+---------
animal_code | character varying(3) | | not null |
animal_type_code | character(2) | | not null |
description | character varying(64) | | not null |
Typical content might be:
animal_code | animal_type_code | description
-------------+------------------+------------------------------
XAA | T | Not an animal, but a toaster
1 | D | This is a dog called Bob
2 | C | This cat is called Frank
3 | C | Wilf the cat has three legs
4 | D | Thunder is a dog
An existing stored procedure I'm working with receives a text string containing a comma-separated list of these animal_code values, like this:
store_pet_show_details(
'London', -- p_venue VARCHAR(64)
'2019-12-03', -- p_date TIMESTAMPTZ
'XAA,91,22,23,74,15,64,47,12' -- p_entrants TEXT
);
I'm using unnest(string_to_array(code_csv, ',')) to extract the animal entry codes.
It's probably very simple, but I just want to see whether any entrants have an animal_type_code of "T".
Note the trim in the query below: it strips any whitespace around each code before the join.
select animal_code, animal_type_code, description
from animals
inner join (select trim(e) entrant from unnest(string_to_array(p_entrants, ',')) e) t
on animal_code = entrant
where animal_type_code = 'T';
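To illustrate why the trim matters, here is a small client-side sketch in Python rather than SQL. The ANIMALS dictionary is a hypothetical stand-in for the animals table, and the helper names are made up for this example. A code surrounded by spaces does not match unless it is stripped first.

```python
def parse_entrants(p_entrants: str) -> list:
    """Split a comma-separated list and strip whitespace from each code,
    mirroring trim(e) over unnest(string_to_array(p_entrants, ','))."""
    return [code.strip() for code in p_entrants.split(",")]


# Hypothetical sample mirroring the animals table: animal_code -> animal_type_code
ANIMALS = {"XAA": "T", "1": "D", "2": "C", "3": "C", "4": "D"}


def entrants_of_type(p_entrants, animals, type_code):
    """Return the entrant codes whose animal type equals type_code."""
    return [code for code in parse_entrants(p_entrants)
            if animals.get(code) == type_code]


csv = "1, XAA ,3"
toasters = entrants_of_type(csv, ANIMALS, "T")
# Without stripping, " XAA " is looked up literally and nothing matches.
untrimmed_hits = [c for c in csv.split(",") if ANIMALS.get(c) == "T"]
print(toasters, untrimmed_hits)
```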

Join and combine tables to get common rows in a specific column together in Postgres

I have a couple of tables in a Postgres database. I have joined and merged the tables. However, I would like rows with common values in a specific column to appear together in the final table (in the end, I would like to perform a group by and a maximum value calculation on the table).
The schema of the test tables looks like this:
Schema (PostgreSQL v11)
CREATE TABLE table1 (
id CHARACTER VARYING NOT NULL,
seq CHARACTER VARYING NOT NULL
);
INSERT INTO table1 (id, seq) VALUES
('UA502', 'abcdef'), ('UA503', 'ghijk'),('UA504', 'lmnop')
;
CREATE TABLE table2 (
id CHARACTER VARYING NOT NULL,
score FLOAT
);
INSERT INTO table2 (id, score) VALUES
('UA502', 2.2), ('UA503', 2.6),('UA504', 2.8)
;
CREATE TABLE table3 (
id CHARACTER VARYING NOT NULL,
seq CHARACTER VARYING NOT NULL
);
INSERT INTO table3 (id, seq) VALUES
('UA502', 'qrst'), ('UA503', 'uvwx'),('UA504', 'yzab')
;
CREATE TABLE table4 (
id CHARACTER VARYING NOT NULL,
score FLOAT
);
INSERT INTO table4 (id, score) VALUES
('UA502', 8.2), ('UA503', 8.6),('UA504', 8.8)
;
I performed join and union operations on the tables to get the desired columns.
Query #1
SELECT table1.id, table1.seq, table2.score
FROM table1 INNER JOIN table2 ON table1.id = table2.id
UNION
SELECT table3.id, table3.seq, table4.score
FROM table3 INNER JOIN table4 ON table3.id = table4.id
;
The output looks like this:
| id | seq | score |
| ----- | ------ | ----- |
| UA502 | qrst | 8.2 |
| UA502 | abcdef | 2.2 |
| UA504 | yzab | 8.8 |
| UA503 | uvwx | 8.6 |
| UA504 | lmnop | 2.8 |
| UA503 | ghijk | 2.6 |
However, the desired output should be:
| id | seq | score |
| ----- | ------ | ----- |
| UA502 | qrst | 8.2 |
| UA502 | abcdef | 2.2 |
| UA504 | yzab | 8.8 |
| UA504 | lmnop | 2.8 |
| UA503 | uvwx | 8.6 |
| UA503 | ghijk | 2.6 |
View on DB Fiddle
How should I modify my query to get the desired output?
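One possible way to keep rows with the same id together is to sort the combined result, since a UNION on its own guarantees no particular row order. The sketch below runs the query against Python's built-in sqlite3 module purely as a stand-in for PostgreSQL. The wrapping subquery with ORDER BY is standard SQL and is assumed to behave the same way in Postgres. Note that it sorts the ids ascending (UA502, UA503, UA504), which differs from the exact id order in the desired output but keeps each id's rows adjacent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id TEXT NOT NULL, seq TEXT NOT NULL);
CREATE TABLE table2 (id TEXT NOT NULL, score FLOAT);
CREATE TABLE table3 (id TEXT NOT NULL, seq TEXT NOT NULL);
CREATE TABLE table4 (id TEXT NOT NULL, score FLOAT);
INSERT INTO table1 VALUES ('UA502', 'abcdef'), ('UA503', 'ghijk'), ('UA504', 'lmnop');
INSERT INTO table2 VALUES ('UA502', 2.2), ('UA503', 2.6), ('UA504', 2.8);
INSERT INTO table3 VALUES ('UA502', 'qrst'), ('UA503', 'uvwx'), ('UA504', 'yzab');
INSERT INTO table4 VALUES ('UA502', 8.2), ('UA503', 8.6), ('UA504', 8.8);
""")

COMBINED = """
SELECT table1.id AS id, table1.seq AS seq, table2.score AS score
FROM table1 INNER JOIN table2 ON table1.id = table2.id
UNION
SELECT table3.id, table3.seq, table4.score
FROM table3 INNER JOIN table4 ON table3.id = table4.id
"""

# Sort the combined rows so equal ids are adjacent, highest score first.
grouped_rows = conn.execute(
    f"SELECT * FROM ({COMBINED}) AS combined ORDER BY id, score DESC"
).fetchall()

# The follow-up aggregation mentioned in the question: maximum score per id.
max_per_id = conn.execute(
    f"SELECT id, MAX(score) FROM ({COMBINED}) AS combined GROUP BY id ORDER BY id"
).fetchall()

for row in grouped_rows:
    print(row)
print(max_per_id)
```

If only the maximum per id is needed, the sort is not required at all, because GROUP BY does not depend on the row order of its input.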

Access column array and store it in a new table

I have a table like this:
| id | cars | owner |
|----|--------------------------|----------------|
| 1 | {tesla, bmw, mercedes} | Chris Houghton |
| 2 | {toyota, bmw, fiat} | Matt Quinn |
Is there a way to extract the DISTINCT values from the cars array column and store them in a new table without duplicates?
I want this table
| brands |
|--------|
| tesla |
| bmw |
|mercedes|
| toyota |
| fiat |
I believe you are looking for this kind of statement.
SELECT
DISTINCT
table_array.array_unnest
FROM (
SELECT
UNNEST(cars)
FROM
<table>
) AS table_array(array_unnest)
see demo
This indeed works, but how can I store them, for example, in a column "brand" of a table Manufactures?
INSERT INTO
Manufactures
(brand)
SELECT
DISTINCT
table_array.array_unnest
FROM (
SELECT
UNNEST(cars)
FROM
<table>
) AS table_array(array_unnest)
see demo

PostgreSQL trigger execute procedure or join on NEW

-- trigger function
CREATE OR REPLACE FUNCTION fn_cities_geo_update_event() RETURNS trigger AS $fn_cities_geo_update_event$
BEGIN
-- Is it OK to set a value (geog) here that will be used later in the function?
NEW.geog := ST_SetSRID(ST_MakePoint(NEW.longitude,NEW.latitude), 4326)::geography;
-- This is one way I tried to find the average over all cities within 90 km of a new city, so that the new city gets data populated for it:
if NEW.rent_avg IS null then
NEW.rent_avg = (
SELECT avg(a.rent_avg)
FROM cities as a
-- I think I'm missing something here... ?
ST_DWithin(a.geog, NEW.geog, 90000)
);
end if;
-- Here is another way that I tried:
if NEW.food_avg IS null then
NEW.food_avg := (
SELECT avg(a.food_avg)
FROM cities AS a
JOIN NEW AS b
ON ST_DWithin(a.geog, b.geog, 90000)
);
end if;
RETURN NEW;
END;
$fn_cities_geo_update_event$ LANGUAGE plpgsql;
But neither worked.
edit: here is a copy of the table that I'm working with
l |u |n |population|unesco|r |c |rent_avg|rent_low|rent_high|food_avg|food_low|food_high|transport_avg|transport_low|transport_high|k |i |quality|hcid |hc |latitude |longitude |spread |density |distance |dbn |state|geog |id
-----|--|--------------------------|----------|------|-------------------------|----------------|--------|--------|---------|--------|--------|---------|-------------|-------------|--------------|---------|---|-------|-------------------------------------|----|-----------|----------|-----------|----------|----------|--------------------------|-----|--------------------------------------------------|-----
false|NZ|Gisborne | 34274| 0|Australia and New Zealand|New Zealand | 92.2430| 51.1720| 143.4150| 22.0300| 13.3190| 35.3490| 7.0650| 5.9800| 13.0450|4VHV8X00+|GIS| 1712|place:Gisborne | 46|-38.6640015|177.977005| 0.99940002| 0| | | |0101000020E6100000000000A0433F664000000000FE5443C0| 1611
true |NZ|Patutahi | 386| |Australia and New Zealand|New Zealand | | | | | | | | | |4VHV9V00+| | 1000|place:Patutahi | 35|-38.6170006|177.899994| | | 8.5|Patutahi | |0101000020E6100000000000C0CC3C6640000000E0F94E43C0| 1624
true |NZ|Waihau Bay | | |Australia and New Zealand|New Zealand | | | | | | | | | |4VJV8Q00+| | 1000|place:Waihau_Bay | 6|-37.6780014|177.796005| | |110.699997|Waihau Bay | |0101000020E6100000000000E078396640000000C0C8D642C0| 1671
true |NZ|Tokomaru Bay | 495| |Australia and New Zealand|New Zealand | | | | | | | | | |4VHWV800+| | 1000|place:Tokomaru_Bay | 5|-38.1329994|178.300003| | |65.4000015|Tokomaru Bay | |0101000020E6100000000000A09949664000000020061143C0| 1673
true |FR|Cornebarrieu | | |Western Europe |France | | | | | | | | | |8FM3M800+| | 1000|place:Cornebarrieu | 112| 43.6559982|1.33299994| 3.60581994| | 3.5999999|Cornebarrieu | |0101000020E6100000000000C0F753F53F000000C0F7D34540| 6070
edit: create trigger statement
DROP TRIGGER IF EXISTS tr_cities_inserted ON cities;
CREATE TRIGGER tr_cities_inserted
BEFORE INSERT ON cities
FOR EACH ROW
EXECUTE PROCEDURE fn_cities_geo_update_event();
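Okay, this is interesting. I am new to triggers (as you may have noticed ^^), and it seems that when I recreate a table AFTER creating a trigger for that table, the trigger no longer works: the respective columns just come back null. This is presumably because dropping a table also drops its triggers, so the trigger has to be created again afterwards.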
When I stick to the proper order, I get it to work with my sample reproduction of your case:
CREATE TABLE so_postgres
(n text,
r text,
c text,
rent_avg numeric,
food_avg numeric,
transport_avg numeric,
latitude numeric,
longitude numeric,
geog geography
);
INSERT INTO so_postgres
VALUES ('Möhringen', 'central europe', 'germany', 200.45, 56.45, 4.56, 48.725866,
9.146131, ST_SetSRID(ST_Point(9.146131, 48.725866), 4326)),
('Vaihingen', 'central europe', 'germany', 155.33, 44.12, 2.78, 48.732550,
9.108291, ST_SetSRID(ST_Point(9.108291, 48.732550), 4326)),
('Sigmaringen', 'central europe', 'germany', 298.11, 59.67, 1.99, 48.090797,
9.230243, ST_SetSRID(ST_Point(9.230243, 48.090797), 4326));
CREATE OR REPLACE FUNCTION fn_cities_geo_update_event() RETURNS trigger AS $fn_cities_geo_update_event$
BEGIN
NEW.geog := (ST_SetSRID(ST_MakePoint(NEW.longitude,NEW.latitude), 4326)::geography);
if NEW.rent_avg IS null then
NEW.rent_avg := (
SELECT round(avg(a.rent_avg), 2)
FROM so_postgres as a
WHERE ST_DWithin(a.geog, NEW.geog, 50000)
);
end if;
RETURN NEW;
END;
$fn_cities_geo_update_event$ LANGUAGE plpgsql;
CREATE TRIGGER fn_cities_geo_update_event BEFORE INSERT OR UPDATE ON so_postgres FOR EACH ROW EXECUTE PROCEDURE fn_cities_geo_update_event();
INSERT INTO so_postgres (n, r, c, latitude, longitude)
VALUES ('Degerloch', 'central europe', 'germany', 48.725866, 9.146131);
n | r | c | rent_avg | food_avg | transport_avg | latitude | longitude | geog
-------------+----------------+---------+----------+----------+---------------+-----------+-----------+----------------------------------------------------
Möhringen | central europe | germany | 200.45 | 56.45 | 4.56 | 48.725866 | 9.146131 | 0101000020E610000012DDB3AED14A2240A1A3552DE95C4840
Vaihingen | central europe | germany | 155.33 | 44.12 | 2.78 | 48.732550 | 9.108291 | 0101000020E6100000FBE6FEEA71372240A857CA32C45D4840
Sigmaringen | central europe | germany | 298.11 | 59.67 | 1.99 | 48.090797 | 9.230243 | 0101000020E61000000F441669E275224097C9703C9F0B4840
Degerloch | central europe | germany | 177.89 | | | 48.725866 | 9.146131 | 0101000020E610000012DDB3AED14A2240A1A3552DE95C4840
To answer your last comment: I added the ST_DWithin condition to the rest of the query via a WHERE clause.
Does it work for you?
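The logic of the trigger can also be checked outside the database. The following Python sketch recomputes the rent_avg for the Degerloch row from the three sample cities above. It is only an approximation of ST_DWithin: it uses the haversine formula on a sphere with an assumed mean Earth radius, whereas PostGIS geography calculations use a spheroid, so distances may differ slightly. The function names are made up for this example.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def avg_within(cities, lat, lon, radius_m, field="rent_avg"):
    """Average `field` over cities within radius_m of (lat, lon), rounded to
    2 decimals; None if no city qualifies (like avg() over zero rows)."""
    values = [
        c[field] for c in cities
        if c.get(field) is not None
        and haversine_m(c["latitude"], c["longitude"], lat, lon) <= radius_m
    ]
    return round(sum(values) / len(values), 2) if values else None


# The three rows inserted into so_postgres before the trigger fires.
cities = [
    {"n": "Möhringen", "rent_avg": 200.45, "latitude": 48.725866, "longitude": 9.146131},
    {"n": "Vaihingen", "rent_avg": 155.33, "latitude": 48.732550, "longitude": 9.108291},
    {"n": "Sigmaringen", "rent_avg": 298.11, "latitude": 48.090797, "longitude": 9.230243},
]

# New city "Degerloch" with the coordinates used in the INSERT, 50 km radius.
degerloch_rent = avg_within(cities, 48.725866, 9.146131, 50_000)
print(degerloch_rent)
```

With the 50 km radius used in the function, only Möhringen and Vaihingen qualify, which is consistent with the 177.89 shown for Degerloch in the result table.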

How not to calculate a value twice in select?

psql (9.6.1, server 9.5.5)
employees
Column | Type | Modifiers | Storage | Stats target | Description
----------------+-----------------------------+-----------------------------------------------------------------+----------+--------------+-------------
employee_id | integer | not null default nextval('employees_employee_id_seq'::regclass) | plain | |
first_name | character varying(20) | | extended | |
last_name | character varying(25) | not null | extended | |
email | character varying(25) | not null | extended | |
phone_number | character varying(20) | | extended | |
hire_date | timestamp without time zone | not null | plain | |
job_id | character varying(10) | not null | extended | |
salary | numeric(8,2) | | main | |
commission_pct | numeric(2,2) | | main | |
manager_id | integer | | plain | |
department_id | integer
I need to extract employee number, last name, salary, salary increased by 15.5 % (expressed as a whole number), and the difference between the new and old salary.
I did it like this:
select employee_id,
last_name,
salary,
round(salary * 1.155, 0) as "New Salary",
round(salary * 1.155, 0) - salary as "Increase"
from employees;
What troubles me is that I calculate the new salary twice.
I tried to use an alias in the same select. I experimented like this:
select 2 as val_a, val_a - 4; --not working
Well, my solution produces an acceptable result. But isn't there a better solution?
If performance is what worries you, that calculation costs next to nothing. Some optimizers may even reuse the calculation internally.
If you must do it yourself, you can use a subquery like this:
select t.*,
New_Salary - salary as Increase
from (
select employee_id,
last_name,
salary,
round(salary * 1.155, 0) as New_Salary
from employees
) t;
You can write it with a subquery, if you are "picky" about not computing the same value twice:
SELECT
*, "New Salary" - salary as "Increase"
FROM
(
SELECT
employee_id,
last_name,
salary,
round(salary * 1.155, 0) as "New Salary"
FROM
employees
) AS s0 ;
In practice, the difference when you run it a few times is negligible:
dbfiddle here
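The subquery approach can be tried without a Postgres server. The sketch below uses Python's built-in sqlite3 module as a stand-in, with a single made-up employee row. The table is reduced to the three columns the query needs, and the aliases new_salary and increase are chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    last_name   TEXT NOT NULL,
    salary      NUMERIC
);
-- Hypothetical sample row, not taken from the original data.
INSERT INTO employees VALUES (1, 'Example', 1000);
""")

# The inner query computes new_salary once; the outer query reuses the alias.
rows = conn.execute("""
SELECT s0.*, new_salary - salary AS increase
FROM (
    SELECT employee_id,
           last_name,
           salary,
           round(salary * 1.155, 0) AS new_salary
    FROM employees
) AS s0
""").fetchall()

print(rows)
```

The alias is visible in the outer query because the subquery's output columns are resolved before the outer select list is evaluated, which is exactly what a single select list does not allow.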