PostgreSQL - How to do a loop on a column

I am struggling to write a loop in Postgres; functions in Postgres are not my strong suit.
I have the following table on postgres:
| portfolio_1 | total_risk |
|----------------|------------|
| Top 10 Bets | |
| AAPL34 | 2,06699 |
| DISB34 | 1,712684 |
| PETR4 | 0,753324 |
| PETR3 | 0,087767 |
| VALE3 | 0,086346 |
| LREN3 | 0,055108 |
| AMZO34 | 0,0 |
| Bottom 10 Bets | |
| AAPL34 | 0,0 |
What I'm trying to do is get the rows after "Top 10 Bets" and before "Bottom 10 Bets".
My goal is the following result:
| portfolio_1 | total_risk |
|-------------|------------|
| AAPL34 | 2,06699 |
| DISB34 | 1,712684 |
| PETR4 | 0,753324 |
| PETR3 | 0,087767 |
| VALE3 | 0,086346 |
| LREN3 | 0,055108 |
| AMZO34 | 0,0 |
So my goal is to remove the "Top 10 Bets" row, the "Bottom 10 Bets" row, and the AAPL34 row that follows "Bottom 10 Bets", which is a repeat.
The number of rows varies (I'm importing the data from an Excel file), so I need a loop to do this, right?

SQL tables and result sets represent unordered sets. There is no "before" or "after" unless rows explicitly provide that information.
Let me assume that you have such a column, which I will call id for convenience.
Then you can do this in several ways. Here is one:
select t.*
from t
where t.id > (select min(t2.id) from t t2 where t2.portfolio_1 = 'Top 10 Bets') and
t.id < (select max(t2.id) from t t2 where t2.portfolio_1 = 'Bottom 10 Bets');
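As a quick check of the query above, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for PostgreSQL here, and the id column is the assumed ordering column from the answer, filled in insertion order to mimic the spreadsheet order.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, portfolio_1 TEXT, total_risk TEXT)"
)

# Rows in the same order as the question's table; id is assigned 1..10.
rows = [
    ("Top 10 Bets", None),
    ("AAPL34", "2,06699"),
    ("DISB34", "1,712684"),
    ("PETR4", "0,753324"),
    ("PETR3", "0,087767"),
    ("VALE3", "0,086346"),
    ("LREN3", "0,055108"),
    ("AMZO34", "0,0"),
    ("Bottom 10 Bets", None),
    ("AAPL34", "0,0"),
]
conn.executemany("INSERT INTO t (portfolio_1, total_risk) VALUES (?, ?)", rows)

# Keep only the rows strictly between the two marker rows.
between = conn.execute(
    """
    SELECT t.portfolio_1, t.total_risk
    FROM t
    WHERE t.id > (SELECT min(t2.id) FROM t t2 WHERE t2.portfolio_1 = 'Top 10 Bets')
      AND t.id < (SELECT max(t2.id) FROM t t2 WHERE t2.portfolio_1 = 'Bottom 10 Bets')
    ORDER BY t.id
    """
).fetchall()

for row in between:
    print(row)
```

No loop is needed: the two subqueries find the marker rows, and the comparison on id does the rest regardless of how many rows sit between them.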

Related

PostgreSQL: Transforming rows into columns when more than three columns are needed

I have a table like the following one:
+---------+-------+-------+-------------+--+
| Section | Group | Level | Fulfillment | |
+---------+-------+-------+-------------+--+
| A | Y | 1 | 82.2 | |
| A | Y | 2 | 23.2 | |
| A | M | 1 | 81.1 | |
| A | M | 2 | 28.2 | |
| B | Y | 1 | 89.1 | |
| B | Y | 2 | 58.2 | |
| B | M | 1 | 32.5 | |
| B | M | 2 | 21.4 | |
+---------+-------+-------+-------------+--+
And this would be my desired output:
+---------+-------+--------------------+--------------------+
| Section | Group | Level1_Fulfillment | Level2_Fulfillment |
+---------+-------+--------------------+--------------------+
| A | Y | 82.2 | 23.2 |
| A | M | 81.1 | 28.2 |
| B | Y | 89.1 | 58.2 |
| B | M | 32.5 | 21.4 |
+---------+-------+--------------------+--------------------+
Thus, for each section and group I'd like to obtain the fulfillment percentages for level 1 and level 2. To achieve this, I've tried crosstab(), but it returns an error ("The provided SQL must return 3 columns: rowid, category, and values.") because I'm using more than three columns (I need to keep section and group as identifiers for each row). Is it possible to use crosstab in this case?
Regards.
I find crosstab() unnecessarily complicated to use and prefer conditional aggregation:
select section,
"group",
max(fulfillment) filter (where level = 1) as level_1,
max(fulfillment) filter (where level = 2) as level_2
from the_table
group by section, "group"
order by section;
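To try conditional aggregation without a PostgreSQL server, the sketch below runs the same idea on an in-memory SQLite database from Python. It is an illustration under assumptions: SQLite stands in for PostgreSQL, and the portable max(CASE ...) form replaces the FILTER clause so that it also works on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE the_table (section TEXT, "group" TEXT, level INTEGER, fulfillment REAL)'
)
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?)",
    [
        ("A", "Y", 1, 82.2), ("A", "Y", 2, 23.2),
        ("A", "M", 1, 81.1), ("A", "M", 2, 28.2),
        ("B", "Y", 1, 89.1), ("B", "Y", 2, 58.2),
        ("B", "M", 1, 32.5), ("B", "M", 2, 21.4),
    ],
)

# One output row per (section, group); each CASE picks the value for one level
# and max() collapses the NULLs produced for the other level.
pivoted = conn.execute(
    """
    SELECT section,
           "group",
           max(CASE WHEN level = 1 THEN fulfillment END) AS level_1,
           max(CASE WHEN level = 2 THEN fulfillment END) AS level_2
    FROM the_table
    GROUP BY section, "group"
    ORDER BY section, "group"
    """
).fetchall()

for row in pivoted:
    print(row)
```

Adding a third level is just one more max(CASE ...) column, which is why this form scales to any number of identifier columns in the GROUP BY.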

T-SQL : Pivot table without aggregate

I am trying to understand how to pivot data within T-SQL but can't seem to get it working. I have the following table structure
+-------------------+-----------------------+
| Name | Value |
+-------------------+-----------------------+
| TaskId | 12417 |
| TaskUid | XX00044497 |
| TaskDefId | 23 |
| TaskStatusId | 4 |
| Notes | |
| TaskActivityIndex | 0 |
| ModifiedBy | Orange |
| Modified | /Date(1554540200000)/ |
| CreatedBy | Apple |
| Created | /Date(2121212100000)/ |
| TaskPriorityId | 40 |
| OId | 2 |
+-------------------+-----------------------+
I want to pivot the Name column so that each name becomes a column. Expected output:
+--------+------------------------+-----------+--------------+-------+-------------------+------------+-----------------------+-----------+-----------------------+----------------+-----+
| TASKID | TASKUID | TASKDEFID | TASKSTATUSID | NOTES | TASKACTIVITYINDEX | MODIFIEDBY | MODIFIED | CREATEDBY | CREATED | TASKPRIORITYID | OID |
+--------+------------------------+-----------+--------------+-------+-------------------+------------+-----------------------+-----------+-----------------------+----------------+-----+
| | | | | | | | | | | | |
| 12417 | XX00044497 | 23 | 4 | | 0 | Orange | /Date(1554540200000)/ | Apple | /Date(2121212100000)/ | 40 | 2 |
+--------+------------------------+-----------+--------------+-------+-------------------+------------+-----------------------+-----------+-----------------------+----------------+-----+
Is there an easy way of doing it? The columns are fixed (not dynamic).
Any help appreciated
Try this:
select * from yourtable
pivot
(
min(value)
for Name in ([TaskID],[TaskUID],[TaskDefID]......)
) as pivotable
You can also use CASE expressions.
PIVOT requires an aggregate function; because each Name occurs only once here, min(value) simply returns that single value.
If you want to learn more, here is the reference:
https://learn.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-2017
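The CASE alternative mentioned above can be sketched as follows. This is only an illustration: it runs on an in-memory SQLite database from Python rather than on SQL Server, the table name task_fields is hypothetical, and only three of the twelve names are pivoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# task_fields is a hypothetical name for the question's Name/Value table.
conn.execute("CREATE TABLE task_fields (Name TEXT, Value TEXT)")
conn.executemany(
    "INSERT INTO task_fields VALUES (?, ?)",
    [
        ("TaskId", "12417"),
        ("TaskUid", "XX00044497"),
        ("TaskDefId", "23"),
        ("TaskStatusId", "4"),
    ],
)

# Each CASE yields the value for one name and NULL otherwise;
# max() then picks the single non-NULL value per column.
pivoted = conn.execute(
    """
    SELECT max(CASE WHEN Name = 'TaskId'    THEN Value END) AS TaskId,
           max(CASE WHEN Name = 'TaskUid'   THEN Value END) AS TaskUid,
           max(CASE WHEN Name = 'TaskDefId' THEN Value END) AS TaskDefId
    FROM task_fields
    """
).fetchone()

print(pivoted)
```

As with PIVOT, an aggregate is still needed to fold the many input rows into one output row; the aggregate does no real work when each name appears once.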

Add columns but keep a specific id

I have a table "Listing" that looks like this:
| listing_id | amenities |
|------------|--------------------------------------------------|
| 5629709 | {"Air conditioning",Heating, Essentials,Shampoo} |
| 4156372 | {"Wireless Internet",Kitchen,"Pets allowed"} |
And another table "Amenity" like this:
| amenity_id | amenities |
|------------|--------------------------------------------------|
| 1 | Air conditioning |
| 2 | Kitchen |
| 3 | Heating |
Is there a way to join the two tables into a new one, "Listing_Amenity", like this?
| listing_id | amenities |
|------------|-----------|
| 5629709 | 1 |
| 5629709 | 3 |
| 4156372 | 2 |
You could use unnest:
CREATE TABLE Listing_Amenity
AS
SELECT l.listing_id, a.amenity_id
FROM Listing l
, unnest(l.amenities) sub(elem)
JOIN Amenity a
ON a.amenities = sub.elem;
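To make clear what the unnest-and-join query computes, here is a small plain-Python model of it. It is not PostgreSQL code: the arrays are modeled as Python lists (with the stray leading space on "Essentials" already trimmed), and the variable names are made up for the illustration.

```python
# Listing.amenities, one list per listing_id (models the array column).
listings = {
    5629709: ["Air conditioning", "Heating", "Essentials", "Shampoo"],
    4156372: ["Wireless Internet", "Kitchen", "Pets allowed"],
}

# Amenity table as a lookup from amenity name to amenity_id.
amenity_ids = {"Air conditioning": 1, "Kitchen": 2, "Heating": 3}

# "unnest" = one (listing_id, element) pair per array element;
# the inner join keeps only elements that exist in the Amenity table.
listing_amenity = [
    (listing_id, amenity_ids[name])
    for listing_id, names in listings.items()
    for name in names
    if name in amenity_ids
]

print(listing_amenity)  # [(5629709, 1), (5629709, 3), (4156372, 2)]
```

Elements with no match in Amenity (such as Shampoo) drop out, exactly as they would under the inner join.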

how to return number of records as a part of a select statement?

I'd like to know if there is a way to include row numbers (basically telling me how many records I'm getting back from a database query).
I have the following SQL query
SELECT w.widget_id, w.class_id, wg.name classname, wg.label AS classgroup, c.label, c.seq,
g.name AS group, p.name, p.type, CASE WHEN v.value IS NOT NULL THEN v.value WHEN g2p.value IS NOT NULL THEN g2p.value ELSE p.value END AS value
FROM widgets_to_categories w
INNER JOIN widget_classes c ON w.class_id = c.class_id
JOIN classes_to_param_groups t2g ON c.class_id = t2g.class_id
JOIN widget_groups g ON t2g.group_id = g.group_id
JOIN param_groups_to_params g2p ON t2g.group_id = g2p.group_id
JOIN provisioning_params p ON g2p.param_id = p.param_id
INNER JOIN widget_cat_groups wg ON c.class_group_id = wg.class_group_id
LEFT JOIN widget_values v ON(w.widget_id=v.device_id AND p.param_id=v.param_id AND g.name=v.group_name )
WHERE w.widget_id=8 ORDER BY c.class_id ASC
And it returns data like:
widget_id | class_id | classname | classgroup | label | seq | group | name | type | value
8 | 1 | toy | group A | test label | 1 | toy | reg | text | af
8 | 1 | toy | group A | test label | 1 | reg2 | fall | text | 25327
8 | 1 | toy | group A | test label | 1 | reg2 | pd | text | dvaa
8 | 1 | toy | group A | test label | 1 | reg2 | ext | text | 28235
8 | 1 | toy | group A | test label | 1 | reg1 | ext | text | 28230
8 | 1 | toy | group A | test label | 1 | toy | meec | text | 094F22DE501
8 | 1 | toy | group A | test label | 1 | toy | mmap | text | 0|
8 | 1 | toy | group A | test label | 1 | reg1 | fna | text | 26014
8 | 1 | toy | group A | test label | 1 | reg1 | fall | text | t-123
8 | 1 | toy | group A | test label | 1 | toy | uen | boolean | false
8 | 1 | toy | group A | test label | 1 | toy | adminpd |
I'd like to know if there's a way to have the database auto generate and return another column that is just an identifier for the row, like so:
id |widget_id | class_id | classname | classgroup | label | seq | group | name | type | value
1 | 8 | 1 | toy | group A | test label | 1 | toy | reg | text | af
2 | 8 | 1 | toy | group A | test label | 1 | reg2 | fall | text | 25327
3 | 8 | 1 | toy | group A | test label | 1 | reg2 | pd | text | dvaa
4 | 8 | 1 | toy | group A | test label | 1 | reg2 | ext | text | 28235
5 | 8 | 1 | toy | group A | test label | 1 | reg1 | ext | text | 28230
6 | 8 | 1 | toy | group A | test label | 1 | toy | meec | text | 094F22DE501
7 | 8 | 1 | toy | group A | test label | 1 | toy | mmap | text | 0|
8 | 8 | 1 | toy | group A | test label | 1 | reg1 | fna | text | 26014
9 | 8 | 1 | toy | group A | test label | 1 | reg1 | fall | text | t-123
10 | 8 | 1 | toy | group A | test label | 1 | toy | uen | boolean | false
11 | 8 | 1 | toy | group A | test label | 1 | toy | adminpd | boolean | false
I think I can do this by selecting into a temporary table, but I haven't figured out the syntax yet. I'm also wondering if there's a simpler way.
Once I get the data back from the database, having this ID field makes it easier to manipulate.
Thanks.
You can use the row_number window function to keep track of each row number.
Like so:
create table foo
(
id serial,
val text
);
INSERT INTO foo (val)
VALUES ('One'), ('Two'), ('Three');
SELECT f.*, row_number() OVER(ORDER BY val)
FROM foo AS f
ORDER BY val;
Here's an SQL Fiddle which shows this:
http://sqlfiddle.com/#!15/0c434/2
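If the fiddle is unavailable, the same example can be reproduced locally. The sketch below uses Python's sqlite3 module with an in-memory database as a stand-in for PostgreSQL; it assumes the bundled SQLite is version 3.25 or later, since that is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY plays the role of the serial column.
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany(
    "INSERT INTO foo (val) VALUES (?)", [("One",), ("Two",), ("Three",)]
)

# row_number() numbers the rows in the order given inside OVER (...).
numbered = conn.execute(
    """
    SELECT row_number() OVER (ORDER BY val) AS rn, f.id, f.val
    FROM foo AS f
    ORDER BY val
    """
).fetchall()

for row in numbered:
    print(row)
```

Note that the numbering follows the ORDER BY inside OVER, not the insertion order, so 'Three' is numbered before 'Two' here.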
Additional options:
You could count the result with a query of the form:
SELECT count(*)
FROM
(
SELECT *
FROM foo
) AS sub; -- PostgreSQL before version 16 requires an alias on a subquery in FROM
Or you may be able to get the row count back as part of the Postgres library you're using. For example, psycopg2 (Python) and DBI (Perl) allow for this (with some caveats). The library you're using may offer something similar.
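The client-side option can be sketched like this. The example uses Python's built-in sqlite3 driver only because it needs no server; the approach of fetching the rows, counting them with len() and numbering them with enumerate() should carry over to other DB-API drivers, though that is an assumption to verify against your own library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany(
    "INSERT INTO foo (val) VALUES (?)", [("One",), ("Two",), ("Three",)]
)

rows = conn.execute("SELECT id, val FROM foo ORDER BY id").fetchall()

# Number of records returned by the query.
row_count = len(rows)

# Prepend a 1-based row identifier to every row, client side.
numbered = [(n, *row) for n, row in enumerate(rows, start=1)]

print(row_count)
for row in numbered:
    print(row)
```

This keeps the SQL untouched, at the cost of holding the full result set in memory before numbering it.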

Optimization of Sybase 15.5 union query

I'm having trouble optimizing the following query on Sybase 15.5. Does anyone know how I could improve it? Each of the tables used has about 30 million rows. I tried my best to optimize it, but it still takes a long time (1.5 hours).
create table #tmp1( f_id smallint, a_date smalldatetime )
create table #tmp2( f_id smallint, a_date smalldatetime )
insert #tmp1
select f_id, a_date = max( a_date )
FROM audit_table
WHERE i_date = @pIDate
group by f_id
insert #tmp2
select f_id , a_date = max( a_date )
FROM n_audit_table
WHERE i_date = @pIDate
group by f_id
create table #tmp(
t_account varchar(32) not null,
t_id varchar(32) not null,
product varchar(64) null
)
insert into #tmp
select t_account,t_id, product
FROM audit_table nt, #tmp1 a
WHERE i_date = @pIDate
and nt.a_date = a.a_date
and nt.f_id = a.f_id
union
select t_account,t_id, product
FROM n_audit_table t, #tmp2 a
WHERE t.item_date = @pIDate
and t.a_date = a.a_date
and t.f_id = a.f_id
Both tables have indexes on i_date, a_date and f_id. Below is the showplan for the statement that takes a long time.
QUERY PLAN FOR STATEMENT 2 (at line 24).
Optimized using Serial Mode
STEP 1
The type of query is INSERT.
10 operator(s) under root
|ROOT:EMIT Operator (VA = 10)
|
| |INSERT Operator (VA = 9)
| | The update mode is direct.
| |
| | |HASH UNION Operator (VA = 8) has 2 children.
| | | Using Worktable1 for internal storage.
| | | Key Count: 3
| | |
| | | |NESTED LOOP JOIN Operator (VA = 3) (Join Type: Inner Join)
| | | |
| | | | |SCAN Operator (VA = 0)
| | | | | FROM TABLE
| | | | | #tmp1
| | | | | a
| | | | | Table Scan.
| | | | | Forward Scan.
| | | | | Positioning at start of table.
| | | | | Using I/O Size 2 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| | | |
| | | | |RESTRICT Operator (VA = 2)(5)(0)(0)(0)(0)
| | | | |
| | | | | |SCAN Operator (VA = 1)
| | | | | | FROM TABLE
| | | | | | audit_table
| | | | | | nt
| | | | | | Index : IX_audit_table
| | | | | | Forward Scan.
| | | | | | Positioning by key.
| | | | | | Keys are:
| | | | | | i_date ASC
| | | | | | a_date ASC
| | | | | | Using I/O Size 2 Kbytes for index leaf pages.
| | | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | | | | Using I/O Size 2 Kbytes for data pages.
| | | | | | With LRU Buffer Replacement Strategy for data pages.
| | |
| | | |NESTED LOOP JOIN Operator (VA = 7) (Join Type: Inner Join)
| | | |
| | | | |SCAN Operator (VA = 4)
| | | | | FROM TABLE
| | | | | #tmp2
| | | | | a
| | | | | Table Scan.
| | | | | Forward Scan.
| | | | | Positioning at start of table.
| | | | | Using I/O Size 2 Kbytes for data pages.
| | | | | With LRU Buffer Replacement Strategy for data pages.
| | | |
| | | | |RESTRICT Operator (VA = 6)(5)(0)(0)(0)(0)
| | | | |
| | | | | |SCAN Operator (VA = 5)
| | | | | | FROM TABLE
| | | | | | n_audit_table
| | | | | | t
| | | | | | Index : IX_n_audit_table
| | | | | | Forward Scan.
| | | | | | Positioning by key.
| | | | | | Keys are:
| | | | | | i_date ASC
| | | | | | a_date ASC
| | | | | | Using I/O Size 2 Kbytes for index leaf pages.
| | | | | | With LRU Buffer Replacement Strategy for index leaf pages.
| | | | | | Using I/O Size 2 Kbytes for data pages.
| | | | | | With LRU Buffer Replacement Strategy for data pages.
| |
| | TO TABLE
| | #tmp
| | Using I/O Size 2 Kbytes for data pages.
Total estimated I/O cost for statement 2 (at line 24): 29322945.
I doubt it's a union issue; the individual queries are the more likely troublemakers.
I suggest you start by adding indexes on your temp tables:
create table #tmp1( f_id smallint, a_date smalldatetime )
create clustered index IX1Temp on #tmp1(f_id)
create nonclustered index IX2Temp on #tmp1(a_date)
...
Also, I don't see much point in #tmp1 and #tmp2 the way you use them. You could use a derived table instead (or a CTE, where the server supports one). I would also recommend trying PARTITION BY instead of GROUP BY, if window functions are available on your version.
According to the query execution plan, the problem is the table scans on the temporary tables.
Please get the execution plan for the following query:
insert into #tmp
select t_account,t_id, product
FROM
audit_table nt,
(
select f_id, a_date = max(a_date)
FROM audit_table
WHERE i_date = @pIDate
group by f_id
) a
WHERE
i_date = @pIDate
and nt.a_date = a.a_date
and nt.f_id = a.f_id
union
select t_account,t_id, product
FROM
n_audit_table t,
(
select f_id , a_date = max( a_date )
FROM n_audit_table
WHERE i_date = @pIDate
group by f_id
) a
WHERE
t.item_date = @pIDate
and t.a_date = a.a_date
and t.f_id = a.f_id
How many rows end up in each of the temporary tables?
It looks like the temporary tables could be replaced by using HAVING, but I would need to test it; it is always complicated when you group by a single column and need more columns in the output.
Try running this statement with SET STATISTICS PLANCOST ON and SET STATISTICS IO ON, as that would give a good idea of the number of pages scanned and show whether Sybase is going wrong somewhere while optimising the query.
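The derived-table rewrite above follows the usual "latest row per group" pattern: compute max(a_date) per f_id in a subquery, then join back to the base table to pick up the other columns. The sketch below shows that pattern on a tiny, made-up data set, using an in-memory SQLite database from Python as a stand-in for Sybase. The rows, the p_idate variable and the ANSI JOIN syntax are assumptions for the illustration, and it says nothing about how Sybase would cost the real query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE audit_table (
        t_account TEXT, t_id TEXT, product TEXT,
        f_id INTEGER, i_date TEXT, a_date TEXT
    )
    """
)
# Made-up sample rows, not data from the question.
conn.executemany(
    "INSERT INTO audit_table VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("acc1", "t1", "bond",   1, "2013-01-01", "2013-01-01 09:00"),
        ("acc1", "t2", "equity", 1, "2013-01-01", "2013-01-01 17:00"),  # latest for f_id 1
        ("acc2", "t3", "fx",     2, "2013-01-01", "2013-01-01 12:00"),  # latest for f_id 2
        ("acc3", "t4", "bond",   2, "2013-01-02", "2013-01-02 08:00"),  # other i_date
    ],
)

p_idate = "2013-01-01"  # plays the role of the query's date parameter

# Derived table "a" holds the latest a_date per f_id for the given i_date;
# joining back on (f_id, a_date) retrieves the remaining columns.
latest = conn.execute(
    """
    SELECT nt.t_account, nt.t_id, nt.product
    FROM audit_table nt
    JOIN (SELECT f_id, max(a_date) AS a_date
          FROM audit_table
          WHERE i_date = ?
          GROUP BY f_id) a
      ON nt.f_id = a.f_id AND nt.a_date = a.a_date
    WHERE nt.i_date = ?
    ORDER BY nt.t_id
    """,
    (p_idate, p_idate),
).fetchall()

print(latest)  # [('acc1', 't2', 'equity'), ('acc2', 't3', 'fx')]
```

Whether the derived table or an indexed temporary table is faster on 30 million rows depends on the plan the optimizer picks, so comparing the two showplans remains the deciding step.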