Assign a constant integer to newly inserted values based on existing rows - postgresql

I'm looking for an intelligent sequence generator that can assign a constant int value to a column if a string column value already exists in the table. The scenario is as below:
| Col1 | Col2 | Col3 |
|------|------|------|
| a    | a    | 1    |
| b    | a    | 1    |
| c    | a    | 1    |
| u    | b    | 2    |
| v    | b    | 2    |
| w    | b    | 2    |
Let's say I insert another row, ('d','a'), into Col1 and Col2 respectively. I want Col3 to become 1 automatically, since the Col3 value corresponding to 'a' already exists as 1, so the table becomes:
| Col1 | Col2 | Col3 |
|------|------|------|
| a    | a    | 1    |
| b    | a    | 1    |
| c    | a    | 1    |
| u    | b    | 2    |
| v    | b    | 2    |
| w    | b    | 2    |
| d    | a    | 1    |
Is there a way I can define this in CREATE TABLE so that the Col3 value is set automatically upon insertion into Col1 and Col2?
Edit:
The scenario is something like this:
| Col1              | Col2      | Col3 |
|-------------------|-----------|------|
| Adobe             | Adobe     | 1    |
| Adobe Systems     | Adobe     | 1    |
| Adobe Systems Inc | Adobe     | 1    |
| Honeywell         | Honeywell | 2    |
| Honeywell Inc     | Honeywell | 2    |
| Honeywell Inc.    | Honeywell | 2    |
And when I add new data, I would like it to be
| Col1                        | Col2      | Col3 |
|-----------------------------|-----------|------|
| Adobe                       | Adobe     | 1    |
| Adobe Systems               | Adobe     | 1    |
| Adobe Systems Inc           | Adobe     | 1    |
| Honeywell                   | Honeywell | 2    |
| Honeywell Inc               | Honeywell | 2    |
| Honeywell Inc.              | Honeywell | 2    |
| Adobe Systems Incorporated  | Adobe     | 1    |
The Col3 value has to be an integer for faster joins with other tables. I will insert values for Col1 & Col2, and the corresponding value should be available in Col3 upon insert.

Just normalize it:
create table corporation (
    corporation_id serial,
    short_name     text
);

insert into corporation (short_name) values
    ('Adobe'), ('Honeywell');

select * from corporation;

 corporation_id | short_name
----------------+------------
              1 | Adobe
              2 | Honeywell
Now your table is:
| Col1                        | corporation_id |
|-----------------------------|----------------|
| Adobe                       | 1              |
| Adobe Systems               | 1              |
| Adobe Systems Inc           | 1              |
| Honeywell                   | 2              |
| Honeywell Inc               | 2              |
| Honeywell Inc.              | 2              |
| Adobe Systems Incorporated  | 1              |
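To make the lookup automatic at insert time, the normalization can be combined with an upsert. This is a rough sketch, not part of the original answer: the alias table name company_alias, the primary key and unique constraint, and the ON CONFLICT clause (which needs PostgreSQL 9.5 or later) are all assumptions.

-- Sketch of the full normalized layout (names and constraints are assumptions).
create table corporation (
    corporation_id serial primary key,
    short_name     text not null unique
);

create table company_alias (
    col1           text primary key,
    corporation_id int  not null references corporation (corporation_id)
);

-- Insert a new alias: create the corporation if it does not exist yet,
-- otherwise reuse its existing id.
with corp as (
    insert into corporation (short_name)
    values ('Adobe')
    on conflict (short_name) do update
        set short_name = excluded.short_name
    returning corporation_id
)
insert into company_alias (col1, corporation_id)
select 'Adobe Systems Incorporated', corporation_id
from corp;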

The Col3 value has to be an integer for faster joins with other tables.
So you need unique, but not necessarily sequential, values. You can use a hash function to map strings to integers (practically unique, though collisions are theoretically possible), e.g.:
postgres=# select hashtext('Adobe');
hashtext
-----------
173079840
(1 row)
postgres=# select hashtext('Honeywell');
hashtext
------------
-453308048
(1 row)
With such a function, you avoid the need to look up existing values in the table.
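For example, the value could be filled in automatically with a BEFORE INSERT trigger. This is a sketch under assumptions: the table name company_names and the trigger/function names are invented here, Col3 is assumed to be an integer column, and hashtext() is an internal function whose results may change across major PostgreSQL versions.

-- Derive Col3 from Col2 at insert time so no lookup is needed (sketch).
create or replace function set_col3_from_col2()
returns trigger as
$$
begin
    new.col3 := hashtext(new.col2);
    return new;
end;
$$ language plpgsql;

create trigger trg_set_col3
    before insert on company_names
    for each row
    execute procedure set_col3_from_col2();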
But are you really sure you will have performance problems using strings for foreign keys? You should test it on your data, I think. (Also test on upcoming 9.5, with its abbreviated keys feature.)

Related

Create a PostgreSQL function that becomes a formula field of a table, retrieving related data from another table

The example below can be done in SQL Server. It is a function that performs a calculation on another table, using the current table's Id field, and returns a single value.
Question: how to do the same thing in PostgreSQL?
SELECT TOP(5) * FROM Artists;
+------------+------------------+--------------+-------------+
| ArtistId | ArtistName | ActiveFrom | CountryId |
|------------+------------------+--------------+-------------|
| 1 | Iron Maiden | 1975-12-25 | 3 |
| 2 | AC/DC | 1973-01-11 | 2 |
| 3 | Allan Holdsworth | 1969-01-01 | 3 |
| 4 | Buddy Rich | 1919-01-01 | 6 |
| 5 | Devin Townsend | 1993-01-01 | 8 |
+------------+------------------+--------------+-------------+
SELECT TOP(5) * FROM Albums;
+-----------+------------------------+---------------+------------+-----------+
| AlbumId | AlbumName | ReleaseDate | ArtistId | GenreId |
|-----------+------------------------+---------------+------------+-----------|
| 1 | Powerslave | 1984-09-03 | 1 | 1 |
| 2 | Powerage | 1978-05-05 | 2 | 1 |
| 3 | Singing Down the Lane | 1956-01-01 | 6 | 3 |
| 4 | Ziltoid the Omniscient | 2007-05-21 | 5 | 1 |
| 5 | Casualties of Cool | 2014-05-14 | 5 | 1 |
+-----------+------------------------+---------------+------------+-----------+
The function
CREATE FUNCTION [dbo].[ufn_AlbumCount] (@ArtistId int)
RETURNS smallint
AS
BEGIN
    DECLARE @AlbumCount int;

    SELECT @AlbumCount = COUNT(AlbumId)
    FROM Albums
    WHERE ArtistId = @ArtistId;

    RETURN @AlbumCount;
END;
GO
Now, in SQL Server, after adding the computed column with ALTER TABLE Artists ADD AlbumCount AS dbo.ufn_AlbumCount(ArtistId); we can list the rows and get the following result.
+------------+------------------+--------------+-------------+--------------+
| ArtistId | ArtistName | ActiveFrom | CountryId | AlbumCount |
|------------+------------------+--------------+-------------+--------------|
| 1 | Iron Maiden | 1975-12-25 | 3 | 5 |
| 2 | AC/DC | 1973-01-11 | 2 | 3 |
| 3 | Allan Holdsworth | 1969-01-01 | 3 | 2 |
| 4 | Buddy Rich | 1919-01-01 | 6 | 1 |
| 5 | Devin Townsend | 1993-01-01 | 8 | 3 |
| 6 | Jim Reeves | 1948-01-01 | 6 | 1 |
| 7 | Tom Jones | 1963-01-01 | 4 | 3 |
| 8 | Maroon 5 | 1994-01-01 | 6 | 0 |
| 9 | The Script | 2001-01-01 | 5 | 1 |
| 10 | Lit | 1988-06-26 | 6 | 0 |
+------------+------------------+--------------+-------------+--------------+
But how can this be achieved in PostgreSQL?
Postgres doesn't support "virtual" computed columns (i.e. computed columns that are generated at runtime), so there is no exact equivalent. The most efficient solution is a view that does the counting:
create view artists_with_counts
as
select a.*,
       coalesce(t.album_count, 0) as album_count
from artists a
  left join (
    select artist_id, count(*) as album_count
    from albums
    group by artist_id
  ) t on a.artist_id = t.artist_id;
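Querying the view then works like querying a table; a small usage sketch (assuming the Postgres version of the Artists table uses snake_case column names such as artist_name):

select artist_name, album_count
from artists_with_counts
order by album_count desc;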
Another option is to create a function that can be used as a "virtual column" in a SELECT - but as this is evaluated row by row, it will be substantially slower than the view.
create function album_count(p_artist artists)
  returns bigint
as
$$
  select count(*)
  from albums a
  where a.artist_id = p_artist.artist_id;
$$
language sql
stable;
Then you can include this as a column:
select a.*, a.album_count
from artists a;
Using the function like that requires prefixing the function reference with the table alias (alternatively, you can write album_count(a)).
Online example

PostgreSQL: Transforming rows into columns when more than three columns are needed

I have a table like the following one:
+---------+-------+-------+-------------+
| Section | Group | Level | Fulfillment |
+---------+-------+-------+-------------+
| A       | Y     | 1     | 82.2        |
| A       | Y     | 2     | 23.2        |
| A       | M     | 1     | 81.1        |
| A       | M     | 2     | 28.2        |
| B       | Y     | 1     | 89.1        |
| B       | Y     | 2     | 58.2        |
| B       | M     | 1     | 32.5        |
| B       | M     | 2     | 21.4        |
+---------+-------+-------+-------------+
And this would be my desired output:
+---------+-------+--------------------+--------------------+
| Section | Group | Level1_Fulfillment | Level2_Fulfillment |
+---------+-------+--------------------+--------------------+
| A | Y | 82.2 | 23.2 |
| A | M | 81.1 | 28.2 |
| B | Y | 89.1 | 58.2 |
| B | M | 32.5 | 21.4 |
+---------+-------+--------------------+--------------------+
Thus, for each section and group I'd like to obtain the percentage of fulfillment for level 1 and level 2. To achieve this, I've tried crosstab(), but using this function returns an error ("The provided SQL must return 3 columns: rowid, category, and values.") because I'm using more than three columns (I need to keep both section and group as identifiers for each row). Is it possible to use crosstab in this case?
Regards.
I find crosstab() unnecessarily complicated to use and prefer conditional aggregation:
select section,
       "group",
       max(fulfillment) filter (where level = 1) as level_1,
       max(fulfillment) filter (where level = 2) as level_2
from the_table
group by section, "group"
order by section;
Online example
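If the FILTER clause is not available (it was introduced in PostgreSQL 9.4), the same result can be written with CASE expressions inside the aggregates, for example:

select section,
       "group",
       max(case when level = 1 then fulfillment end) as level_1,
       max(case when level = 2 then fulfillment end) as level_2
from the_table
group by section, "group"
order by section;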

postgres sql : getting unified rows

I have one table where I dump all records from different sources (x, y, z) like below
+----+--------+
| id | source |
+----+--------+
| 1  | x      |
| 2  | y      |
| 3  | x      |
| 4  | x      |
| 5  | y      |
| 6  | z      |
| 7  | z      |
| 8  | x      |
| 9  | z      |
| 10 | z      |
+----+--------+
Then I have one mapping table where I map values between sources based on my use case, like below:
+----+-----------+
| id | mapped_id |
+----+-----------+
| 1 | 2 |
| 1 | 9 |
| 3 | 7 |
| 4 | 10 |
| 5 | 1 |
+----+-----------+
I want merged results where I can see only unique records, like:
+-----+------------+
| id | mapped_ids |
+-----+------------+
| 1 | 2,9,5 |
| 3 | 7 |
| 4 | 10 |
| 6 | null |
| 8 | null |
+-----+------------+
I am trying different options but could not figure this out. Is there a way I can write joins to do this? I have to use the mapping table where the associations are stored, and identify unique records along with records that are not mapped anywhere.
My understanding is, you want to see all dump_table IDs that do not appear in the mapped_id column and then aggregate the mapped_ids for those that are left:
select d1.id,
       array_agg(m1.mapped_id order by m1.mapped_id)
         filter (where m1.mapped_id is not null) as mapped_ids
from dump_table d1
  left join mapping_table m1 using (id)
where not exists (select *
                  from mapping_table m2
                  where m2.mapped_id = d1.id)
group by d1.id;
Online example: https://rextester.com/JQZ17650
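For reference, a minimal setup the query above can run against (a sketch only; the table names dump_table and mapping_table follow the answer, and the sample data mirrors the question):

create table dump_table (
    id     int primary key,
    source text
);

create table mapping_table (
    id        int references dump_table (id),
    mapped_id int references dump_table (id)
);

insert into dump_table (id, source) values
    (1, 'x'), (2, 'y'), (3, 'x'), (4, 'x'), (5, 'y'),
    (6, 'z'), (7, 'z'), (8, 'x'), (9, 'z'), (10, 'z');

insert into mapping_table (id, mapped_id) values
    (1, 2), (1, 9), (3, 7), (4, 10), (5, 1);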
Try something like this:
SELECT id, source, ARRAY_AGG(mapped_id) AS mapped_ids
FROM dump_table AS d
LEFT JOIN mapping_table AS m USING (id)
GROUP BY id, source;

postgresql condition column name = data

I have three tables.
table a
| uid | name  | number |
|-----+-------+--------|
| 1   | table | 1      |
| 2   | chair | 2      |
table b
| uid | name  | number |
|-----+-------+--------|
| 1   | john  | 1      |
| 2   | billy | 0      |
| 3   | bob   | 2      |
| 4   | sally | 1      |
table c
| uid | table a | john | billy | bob | sally |
|-----+---------+------+-------+-----+-------|
| 1   | table   | T    |       |     | T     |
| 2   | chair   |      |       | C   |       |
What I need to be able to do is look at the column names in table c, find the corresponding row entry in table b, and if the numbers are the same as the numbers from table a, set the table c value to T; otherwise set it to C. This needs to be triggered whenever either table a or table b is updated.
How can I write an update to the table using an IF or CASE statement?
Thanks.
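Following the rule as stated in the text (T on a match, C otherwise), the basic shape of such an update for a single column could look like the sketch below. This is only an illustration: it covers the CASE part for the john column, uses the literal table names a, b and c from the question, and leaves out extending it to the other person columns or firing it from triggers on tables a and b.

-- Sketch: set c.john to 'T' when john's number matches the row's number in a.
update c
set john = case
               when (select b.number from b where b.name = 'john') = a.number
                   then 'T'
               else 'C'
           end
from a
where a.uid = c.uid;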

how to return number of records as a part of a select statement?

I'd like to know if there is a way to include row numbers (basically telling me how many records I'm getting back from a database query).
I have the following SQL query
SELECT w.widget_id, w.class_id, wg.name AS classname, wg.label AS classgroup,
       c.label, c.seq, g.name AS "group", p.name, p.type,
       CASE WHEN v.value IS NOT NULL THEN v.value
            WHEN g2p.value IS NOT NULL THEN g2p.value
            ELSE p.value
       END AS value
FROM widgets_to_categories w
INNER JOIN widget_classes c ON w.class_id = c.class_id
JOIN classes_to_param_groups t2g ON c.class_id = t2g.class_id
JOIN widget_groups g ON t2g.group_id = g.group_id
JOIN param_groups_to_params g2p ON t2g.group_id = g2p.group_id
JOIN provisioning_params p ON g2p.param_id = p.param_id
INNER JOIN widget_cat_groups wg ON c.class_group_id = wg.class_group_id
LEFT JOIN widget_values v ON (w.widget_id = v.device_id AND p.param_id = v.param_id AND g.name = v.group_name)
WHERE w.widget_id = 8
ORDER BY c.class_id ASC
And it returns data like:
widget_id | class_id | classname | classgroup | label | seq | group | name | type | value
8 | 1 | toy | group A | test label | 1 | toy | reg | text | af
8 | 1 | toy | group A | test label | 1 | reg2 | fall | text | 25327
8 | 1 | toy | group A | test label | 1 | reg2 | pd | text | dvaa
8 | 1 | toy | group A | test label | 1 | reg2 | ext | text | 28235
8 | 1 | toy | group A | test label | 1 | reg1 | ext | text | 28230
8 | 1 | toy | group A | test label | 1 | toy | meec | text | 094F22DE501
8 | 1 | toy | group A | test label | 1 | toy | mmap | text | 0|
8 | 1 | toy | group A | test label | 1 | reg1 | fna | text | 26014
8 | 1 | toy | group A | test label | 1 | reg1 | fall | text | t-123
8 | 1 | toy | group A | test label | 1 | toy | uen | boolean | false
8 | 1 | toy | group A | test label | 1 | toy | adminpd |
I'd like to know if there's a way to have the database auto generate and return another column that is just an identifier for the row, like so:
id |widget_id | class_id | classname | classgroup | label | seq | group | name | type | value
1 | 8 | 1 | toy | group A | test label | 1 | toy | reg | text | af
2 | 8 | 1 | toy | group A | test label | 1 | reg2 | fall | text | 25327
3 | 8 | 1 | toy | group A | test label | 1 | reg2 | pd | text | dvaa
4 | 8 | 1 | toy | group A | test label | 1 | reg2 | ext | text | 28235
5 | 8 | 1 | toy | group A | test label | 1 | reg1 | ext | text | 28230
6 | 8 | 1 | toy | group A | test label | 1 | toy | meec | text | 094F22DE501
7 | 8 | 1 | toy | group A | test label | 1 | toy | mmap | text | 0|
8 | 8 | 1 | toy | group A | test label | 1 | reg1 | fna | text | 26014
9 | 8 | 1 | toy | group A | test label | 1 | reg1 | fall | text | t-123
10 | 8 | 1 | toy | group A | test label | 1 | toy | uen | boolean | false
11 | 8 | 1 | toy | group A | test label | 1 | toy | adminpd | boolean | false
I think I can do this by selecting into a temporary table, but I haven't figured out the syntax yet. I'm also wondering if there's a simpler way.
Once I get the data back from the database, having this ID field makes it easier to manipulate.
Thanks.
You can use the row_number() window function to number each row, like so:
create table foo
(
    id  serial,
    val text
);

INSERT INTO foo (val)
VALUES ('One'), ('Two'), ('Three');

SELECT f.*, row_number() OVER (ORDER BY val)
FROM foo AS f
ORDER BY val;
Here's an SQL Fiddle which shows this:
http://sqlfiddle.com/#!15/0c434/2
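The same pattern can be wrapped around an existing query (such as the large join in the question) without modifying it, for instance by reusing the foo table from above. Note that OVER () with no ORDER BY numbers rows in no guaranteed order, so specify one inside OVER () when the numbering matters:

-- Number the rows of an arbitrary query by wrapping it in a subquery.
SELECT row_number() OVER (ORDER BY q.val) AS rn, q.*
FROM (
    SELECT val
    FROM foo
) AS q;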
Additional options:
You could count the result with a query of the form:
SELECT count(*)
FROM
(
    SELECT *
    FROM foo
) AS sub;
Or you may be able to get the row count back as part of the Postgres library you're using. For example, psycopg2 (Python) and DBI (Perl) allow for this (with some caveats). The library you're using may offer something similar.