PostgreSQL: Transforming rows into columns when more than three columns are needed - postgresql

I have a table like the following one:
+---------+-------+-------+-------------+--+
| Section | Group | Level | Fulfillment | |
+---------+-------+-------+-------------+--+
| A | Y | 1 | 82.2 | |
| A | Y | 2 | 23.2 | |
| A | M | 1 | 81.1 | |
| A | M | 2 | 28.2 | |
| B | Y | 1 | 89.1 | |
| B | Y | 2 | 58.2 | |
| B | M | 1 | 32.5 | |
| B | M | 2 | 21.4 | |
+---------+-------+-------+-------------+--+
And this would be my desired output:
+---------+-------+--------------------+--------------------+
| Section | Group | Level1_Fulfillment | Level2_Fulfillment |
+---------+-------+--------------------+--------------------+
| A | Y | 82.2 | 23.2 |
| A | M | 81.1 | 28.2 |
| B | Y | 89.1 | 58.2 |
| B | M | 32.5 | 21.4 |
+---------+-------+--------------------+--------------------+
Thus, for each section and group I'd like to obtain the fulfillment percentages for level 1 and level 2. To achieve this, I've tried crosstab(), but it returns an error ("The provided SQL must return 3 columns: rowid, category, and values.") because I'm using more than three columns (I need to keep section and group as identifiers for each row). Is it possible to use crosstab in this case?
Regards.

I find crosstab() unnecessarily complicated to use and prefer conditional aggregation:
select section,
"group",
max(fulfillment) filter (where level = 1) as level_1,
max(fulfillment) filter (where level = 2) as level_2
from the_table
group by section, "group"
order by section;
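To try the conditional aggregation without a PostgreSQL server, here is a minimal sketch that runs the same idea from Python against an in-memory SQLite database. SQLite is only a stand-in here (an assumption for illustration), and the query uses CASE inside max(), which is the portable spelling of filter (where ...); the table and column names are the ones from the answer above.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE the_table (section TEXT, "group" TEXT, level INTEGER, fulfillment REAL)'
)
rows = [
    ("A", "Y", 1, 82.2), ("A", "Y", 2, 23.2),
    ("A", "M", 1, 81.1), ("A", "M", 2, 28.2),
    ("B", "Y", 1, 89.1), ("B", "Y", 2, 58.2),
    ("B", "M", 1, 32.5), ("B", "M", 2, 21.4),
]
conn.executemany("INSERT INTO the_table VALUES (?, ?, ?, ?)", rows)

# One output column per level: the CASE keeps only the rows of that level,
# and MAX collapses each (section, group) to a single value.
pivot_sql = """
    SELECT section,
           "group",
           MAX(CASE WHEN level = 1 THEN fulfillment END) AS level_1,
           MAX(CASE WHEN level = 2 THEN fulfillment END) AS level_2
    FROM the_table
    GROUP BY section, "group"
    ORDER BY section, "group"
"""
pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

Each additional level only needs one more MAX(CASE ...) column, so the number of identifier columns in GROUP BY is not limited the way it is with crosstab().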

Replace null by negative id number in not consecutive rows in hive

I have this table in my database:
| id | desc |
|-------------|
| 1 | A |
| 2 | B |
| NULL | C |
| 3 | D |
| NULL | D |
| NULL | E |
| 4 | F |
---------------
And I want to transform this table into one that replaces the nulls with consecutive negative ids:
| id | desc |
|-------------|
| 1 | A |
| 2 | B |
| -1 | C |
| 3 | D |
| -2 | D |
| -3 | E |
| 4 | F |
---------------
Does anyone know how I can do this in Hive?
The approach below works:
select coalesce(id,concat('-',ROW_NUMBER() OVER (partition by id))) as id,desc from database_name.table_name;
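To make the intended transformation concrete, here is a small plain-Python sketch. It is not Hive code and it assumes the rows arrive in the order shown in the question; the function name fill_null_ids is made up for this illustration.

```python
def fill_null_ids(rows):
    """Replace each missing id with the next negative number: -1, -2, ..."""
    next_negative = -1
    result = []
    for row_id, desc in rows:
        if row_id is None:
            # Hand out negative ids in the order the NULL rows are seen.
            row_id = next_negative
            next_negative -= 1
        result.append((row_id, desc))
    return result


table = [(1, "A"), (2, "B"), (None, "C"), (3, "D"), (None, "D"), (None, "E"), (4, "F")]
filled = fill_null_ids(table)
for row in filled:
    print(row)
```

In SQL the same numbering comes from a window function; note that a window without an ORDER BY does not guarantee which NULL row receives which negative id.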

PostgreSQL - How to do a Loop on a column

I am struggling to write a loop in Postgres, and Postgres functions are not my cup of tea.
I have the following table on postgres:
| portfolio_1 | total_risk |
|----------------|------------|
| Top 10 Bets | |
| AAPL34 | 2,06699 |
| DISB34 | 1,712684 |
| PETR4 | 0,753324 |
| PETR3 | 0,087767 |
| VALE3 | 0,086346 |
| LREN3 | 0,055108 |
| AMZO34 | 0,0 |
| Bottom 10 Bets | |
| AAPL34 | 0,0 |
What I'm trying to do is get the values after the "Top 10 Bets" row and before the "Bottom 10 Bets" row.
My goal is the following result:
| portfolio_1 | total_risk |
|-------------|------------|
| AAPL34 | 2,06699 |
| DISB34 | 1,712684 |
| PETR4 | 0,753324 |
| PETR3 | 0,087767 |
| VALE3 | 0,086346 |
| LREN3 | 0,055108 |
| AMZO34 | 0,0 |
So, my goal is to remove the "Top 10 Bets" row, the "Bottom 10 Bets" row, and the repeated AAPL34 row that comes after "Bottom 10 Bets".
The quantity of rows is variable (I'm importing it from an Excel file), so I need a loop to do this, right?
SQL tables and result sets represent unordered sets. There is no "before" or "after" unless rows explicitly provide that information.
Let me assume that you have such a column, which I will call id for convenience.
Then you can do this in several ways. Here is one:
select t.*
from t
where t.id > (select min(t2.id) from t t2 where t2.portfolio_1 = 'Top 10 Bets') and
t.id < (select max(t2.id) from t t2 where t2.portfolio_1 = 'Bottom 10 Bets');
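A minimal sketch of that query in action, run from Python against an in-memory SQLite database as a stand-in for PostgreSQL (an assumption for illustration). As in the answer, it assumes an id column that records the import order; here it is an auto-assigned integer key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is filled automatically in insertion order and supplies the row ordering.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, portfolio_1 TEXT, total_risk TEXT)")
data = [
    ("Top 10 Bets", None),
    ("AAPL34", "2,06699"),
    ("DISB34", "1,712684"),
    ("PETR4", "0,753324"),
    ("PETR3", "0,087767"),
    ("VALE3", "0,086346"),
    ("LREN3", "0,055108"),
    ("AMZO34", "0,0"),
    ("Bottom 10 Bets", None),
    ("AAPL34", "0,0"),
]
conn.executemany("INSERT INTO t (portfolio_1, total_risk) VALUES (?, ?)", data)

# Keep only the rows strictly between the two marker rows.
between_sql = """
    SELECT t.portfolio_1, t.total_risk
    FROM t
    WHERE t.id > (SELECT MIN(t2.id) FROM t t2 WHERE t2.portfolio_1 = 'Top 10 Bets')
      AND t.id < (SELECT MAX(t2.id) FROM t t2 WHERE t2.portfolio_1 = 'Bottom 10 Bets')
    ORDER BY t.id
"""
between = conn.execute(between_sql).fetchall()
for row in between:
    print(row)
```

No loop is needed: the two subqueries locate the marker rows and the WHERE clause keeps whatever lies between them, however many rows that is.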

Combine multiple columns to yield unique values

I'm trying to use Tableau (v10.1) to combine 5 separate columns and get a count of the distinct values for that combination. Some rows/columns are empty. For example:
+-------+-------+-------+-------+-------+
| Tag 1 | Tag 2 | Tag 3 | Tag 4 | Tag 5 |
+-------+-------+-------+-------+-------+
| A | B | C | D | E |
| B | D | E | - | - |
| - | - | - | - | - |
| E | A | - | - | - |
+-------+-------+-------+-------+-------+
I want to obtain the following in a Tableau worksheet:
+-----+-------+
| Tag | Count |
+-----+-------+
| E | 3 |
| A | 2 |
| B | 2 |
| D | 2 |
| C | 1 |
+-----+-------+
I would like to do this in Tableau (using calculated fields, etc.) and not change the original data source.
Click on the data source tab, select the five fields named Tag #, and then use the pivot command to reshape the data without changing the original source.
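For readers who want to see what the pivot does to the data, here is a rough Python sketch of the same reshaping outside Tableau. It is not Tableau syntax; the empty cells are represented as None, which is an assumption about how the blanks are stored.

```python
from collections import Counter

# The five Tag columns from the question, one list per row (None = empty cell).
rows = [
    ["A", "B", "C", "D", "E"],
    ["B", "D", "E", None, None],
    [None, None, None, None, None],
    ["E", "A", None, None, None],
]

# Pivoting turns the five columns into one long column of tag values.
long_form = [tag for row in rows for tag in row if tag is not None]

# Counting that single column gives the desired Tag / Count table.
counts = Counter(long_form)
for tag, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(tag, n)
```

After the pivot, the worksheet only has to count one field instead of combining five.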

how to return number of records as a part of a select statement?

I'd like to know if there is a way to include row numbers (basically telling me how many records I'm getting back from a database query).
I have the following SQL query
SELECT w.widget_id, w.class_id, wg.name classname, wg.label AS classgroup, c.label, c.seq,
g.name AS group, p.name, p.type, CASE WHEN v.value IS NOT NULL THEN v.value WHEN g2p.value IS NOT NULL THEN g2p.value ELSE p.value END AS value
FROM widgets_to_categories w
INNER JOIN widget_classes c ON w.class_id = c.class_id
JOIN classes_to_param_groups t2g ON c.class_id = t2g.class_id
JOIN widget_groups g ON t2g.group_id = g.group_id
JOIN param_groups_to_params g2p ON t2g.group_id = g2p.group_id
JOIN provisioning_params p ON g2p.param_id = p.param_id
INNER JOIN widget_cat_groups wg ON c.class_group_id = wg.class_group_id
LEFT JOIN widget_values v ON(w.widget_id=v.device_id AND p.param_id=v.param_id AND g.name=v.group_name )
WHERE w.widget_id=8 ORDER BY c.class_id ASC
And it returns data like:
widget_id | class_id | classname | classgroup | label | seq | group | name | type | value
8 | 1 | toy | group A | test label | 1 | toy | reg | text | af
8 | 1 | toy | group A | test label | 1 | reg2 | fall | text | 25327
8 | 1 | toy | group A | test label | 1 | reg2 | pd | text | dvaa
8 | 1 | toy | group A | test label | 1 | reg2 | ext | text | 28235
8 | 1 | toy | group A | test label | 1 | reg1 | ext | text | 28230
8 | 1 | toy | group A | test label | 1 | toy | meec | text | 094F22DE501
8 | 1 | toy | group A | test label | 1 | toy | mmap | text | 0|
8 | 1 | toy | group A | test label | 1 | reg1 | fna | text | 26014
8 | 1 | toy | group A | test label | 1 | reg1 | fall | text | t-123
8 | 1 | toy | group A | test label | 1 | toy | uen | boolean | false
8 | 1 | toy | group A | test label | 1 | toy | adminpd |
I'd like to know if there's a way to have the database auto generate and return another column that is just an identifier for the row, like so:
id |widget_id | class_id | classname | classgroup | label | seq | group | name | type | value
1 | 8 | 1 | toy | group A | test label | 1 | toy | reg | text | af
2 | 8 | 1 | toy | group A | test label | 1 | reg2 | fall | text | 25327
3 | 8 | 1 | toy | group A | test label | 1 | reg2 | pd | text | dvaa
4 | 8 | 1 | toy | group A | test label | 1 | reg2 | ext | text | 28235
5 | 8 | 1 | toy | group A | test label | 1 | reg1 | ext | text | 28230
6 | 8 | 1 | toy | group A | test label | 1 | toy | meec | text | 094F22DE501
7 | 8 | 1 | toy | group A | test label | 1 | toy | mmap | text | 0|
8 | 8 | 1 | toy | group A | test label | 1 | reg1 | fna | text | 26014
9 | 8 | 1 | toy | group A | test label | 1 | reg1 | fall | text | t-123
10 | 8 | 1 | toy | group A | test label | 1 | toy | uen | boolean | false
11 | 8 | 1 | toy | group A | test label | 1 | toy | adminpd | boolean | false
I think I can do this by selecting into a temporary table, but I haven't figured out the syntax for that yet. I'm also wondering if there's a simpler way.
Once I get the data back from the database, having this ID field makes it easier to manipulate.
Thanks.
You can use the row_number window function to keep track of each row number.
Like so:
create table foo
(
id serial,
val text
);
INSERT INTO foo (val)
VALUES ('One'), ('Two'), ('Three');
SELECT f.*, row_number() OVER(ORDER BY val)
FROM foo AS f
ORDER BY val;
Here's an SQL Fiddle which shows this:
http://sqlfiddle.com/#!15/0c434/2
Additional options:
You could count the result with a query of the form:
SELECT count(*)
FROM
(
SELECT *
FROM foo
) AS sub; -- older PostgreSQL versions require an alias for a subquery in FROM
Or you may be able to get the row count back as part of the Postgres library you're using. For example, psycopg2 (Python) and DBI (Perl) allow for this (with some caveats). The library you're using may offer something similar.
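As a rough sketch of the client-side option, the rows can also be counted and numbered after they are fetched. The example below uses Python's built-in sqlite3 module as a stand-in for a PostgreSQL driver (an assumption for illustration); the foo table is the one from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO foo (val) VALUES (?)", [("One",), ("Two",), ("Three",)])

fetched = conn.execute("SELECT id, val FROM foo ORDER BY val").fetchall()

# Number of records returned by the query.
row_count = len(fetched)

# Prepend a running row number (starting at 1) to every fetched row.
numbered = [(n, *row) for n, row in enumerate(fetched, start=1)]

print(row_count)
for row in numbered:
    print(row)
```

This keeps the SQL unchanged, at the cost of fetching the whole result before the count is known.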

How to compute the dot product of two column (think full column as a vector)?

Given this table:
| a | b | c |
|---+---+----+
| 3 | 4 | |
| 1 | 2 | |
| 1 | 3 | |
| 2 | 2 | |
I want to get the dot product of the two columns a and b; the result should be equal to (3*4)+(1*2)+(1*3)+(2*2), which is 21.
I don't want to use the clumsy formula (B1*B2+C1*C2+D1*D2+E1*E2) because I actually have a large table to calculate.
I know Emacs's Calc tool has a "vprod" function which can do this sort of thing, but I don't know how to turn a full column into a vector.
Can anybody tell me how to achieve this? I'd appreciate it!
In emacs-calc, the simple product of 2 vectors calculates the dot product.
This works (I put the result in @6$3; the parentheses can also be omitted):
| a | b | c |
|---+---+----|
| 3 | 4 | |
| 1 | 2 | |
| 1 | 3 | |
| 2 | 2 | |
|---+---+----|
| | | 21 |
#+TBLFM: @6$3=(@I$1..@II$1)*(@I$2..@II$2)
@I and @II refer to the first and second hlines, so each range spans the rows between them.
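As a quick cross-check of the expected value outside Emacs, the same dot product can be sketched in a few lines of Python, using the column values from the question:

```python
# Columns a and b from the table in the question.
a = [3, 1, 1, 2]
b = [4, 2, 3, 2]

# Dot product: multiply the columns element-wise, then sum the products.
dot = sum(x * y for x, y in zip(a, b))
print(dot)  # 21
```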
This can be solved using babel and R in org-mode:
#+name: mytable
| a | b | c |
|---+---+----+
| 3 | 4 | |
| 1 | 2 | |
| 1 | 3 | |
| 3 | 2 | |
#+begin_src R :var mytable=mytable
sum(mytable$a * mytable$b)
#+end_src
#+RESULTS:
: 23