Call a function for each row in select - Postgres - postgresql

I have a function called "getList(date)". It returns a list of items (with several columns) for the date passed as its parameter.
If I call:
SELECT * FROM getList('12/31/2014');
It works fine and returns a list with the date, the item name and the price.
Something like this:
date item_description price
-----------------------------------------------
12/31/2014 banana 1
12/31/2014 apple 2.5
12/31/2014 coconut 3
But I have another table with the dates that I want to search for.
So, I want to select all the dates from that table and, for each row returned, I want to call my function "getList" to have a result like this:
date item_description price
-----------------------------------------------
12/28/2014 banana 0.5
12/28/2014 apple 1.5
12/28/2014 coconut 2
12/31/2014 banana 1
12/31/2014 apple 2.5
12/31/2014 coconut 3
I don't know exactly how to do it. Of course my data is not a fruit list; that is just an easier way to explain the problem.
Thank you very much.

Correct way - LATERAL join
The correct way to do this is with a lateral query (PostgreSQL 9.3 or newer):
SELECT d."date", f.item_description, f.price
FROM mydates d,
LATERAL getList(d."date") f;
See the manual.
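For readers less familiar with LATERAL, its row-by-row semantics can be sketched in plain Python. This is only an illustration of the logical evaluation order, not of how PostgreSQL executes the join; get_list and the PRICES data are hypothetical stand-ins for the question's getList function and sample rows.

```python
from datetime import date

# Hypothetical stand-in for getList(date): (item_description, price) rows per date.
PRICES = {
    date(2014, 12, 28): [("banana", 0.5), ("apple", 1.5), ("coconut", 2.0)],
    date(2014, 12, 31): [("banana", 1.0), ("apple", 2.5), ("coconut", 3.0)],
}

def get_list(d):
    """Return the rows for one date, or an empty list if there are none."""
    return PRICES.get(d, [])

def lateral_join(dates):
    """For each outer row, call the function and emit one row per result row."""
    return [(d, item, price) for d in dates for item, price in get_list(d)]

rows = lateral_join([date(2014, 12, 28), date(2014, 12, 31)])
```

As in the comma (cross join) form of the query above, an outer date for which the function returns no rows contributes nothing to the output.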
Legacy way - SRF in SELECT
In older versions you must use a PostgreSQL extension with some quirky properties: support for set-returning functions in the SELECT list. Do not use this unless you know you must support PostgreSQL 9.2 or older.
SELECT d."date", (getList(d."date")).*
FROM mydates d;
This may result in multiple evaluation of the getList function, once for each column of the output.

Related

kdb - how to augment table with missing dates in a dynamic/fast way

I have an in-memory table with (date, sym, symType, factor, weight) as columns.
There are cases where this in-memory table once queried for a particular date range is missing an entire date. Could be today's data, or if we're querying for multiple dates, could be a day in the middle, or perhaps multiple days, or the last date, or the beginning.
How can I come up with a query that fills in those missing dates with the max date up to that point?
So if we have data as follows:
Examples:
.z.D
.z.D-2
.z.D-3
.z.D-6
.z.D-7
I'd like the table to look like this:
.z.D -> .z.D
.z.D-1 -> copy of .z.D-2
.z.D-2 -> .z.D-2
.z.D-3 -> .z.D-3
.z.D-4 -> copy of .z.D-6
.z.D-5 -> copy of .z.D-6
.z.D-6 -> .z.D-6
.z.D-7 -> .z.D-7
If today is missing from the query result, use the previous available date as today.
If the last day in the query is yesterday and it's missing, use the previous available day as yesterday, and so on.
If the earliest (min) date is missing, use the next available date upwards.
I can do this manually by identifying missing dates and going through missing dates day by day, but I'm wondering if there's a much better way to do this.
aj can work for dates in the middle by constructing a ([] date: listofdesireddates) cross ([] sym: listofsyms) cross ([] sectors: symtype) and then doing an aj with the table, but it doesn't solve all cases, e.g. if the missing day is today or at the start.
Can you come up with a reproducible example as to why aj doesn't work? Normal aj usage should solve this problem:
t1:([]date:.z.D-til 8;sym:`ABC);
t2:`date xasc([]date:.z.D-0 2 3 6 7;sym:`ABC;data:"I"$ssr[;".";""]each string .z.D-0 2 3 6 7);
q)aj[`sym`date;t1;t2]
date sym data
-----------------------
2020.07.20 ABC 20200720
2020.07.19 ABC 20200718
2020.07.18 ABC 20200718
2020.07.17 ABC 20200717
2020.07.16 ABC 20200714
2020.07.15 ABC 20200714
2020.07.14 ABC 20200714
2020.07.13 ABC 20200713
/If you need your last date to fill "upwards" then use fills:
update fills data by sym from aj[`sym`date;([]date:.z.D-til 9;sym:`ABC);t2]
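The logic that aj plus fills is performing here can be sketched outside q. The following Python is an illustration only, not a kdb+ replacement: asof_fill is a hypothetical helper that maps each wanted date to the latest available date on or before it, and falls back to the earliest available date when nothing earlier exists. It assumes at least one available date.

```python
from bisect import bisect_right
from datetime import date, timedelta

def asof_fill(wanted, available):
    """Map each wanted date to the latest available date <= it.

    Dates earlier than every available date fall back to the earliest
    available date (the "fill upwards" case). Assumes available is non-empty.
    """
    avail = sorted(available)
    out = {}
    for d in wanted:
        i = bisect_right(avail, d)  # number of available dates <= d
        out[d] = avail[i - 1] if i else avail[0]
    return out

today = date(2020, 7, 20)
have = [today - timedelta(days=n) for n in (0, 2, 3, 6, 7)]
want = [today - timedelta(days=n) for n in range(9)]
mapping = asof_fill(want, have)
```

With the dates from the question, today-1 maps to today-2, today-4 and today-5 map to today-6, and today-8 (before any data) maps to today-7.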
A quick guess but a step function with xgroup on the result seems like it will work.
res:getFromTab[dates];
f:{`date xcols update date:x from y@x};
xgrp:`s#`date xasc `date xgroup res;
raze f[;xgrp] each dates
Performance might be horrible ...

how to use the age() function in postgresql

I have a column in the students table called birthdate. I need to find students over the age of 12.
select ......, age(timestamp 'birthdate') as StudentAge
from students
.....
where StudentAge > 11
I don't know if that's the proper syntax or if I'm using the correct function for the situation.
I think most of your confusion comes from unfamiliarity with Postgres's rich type system, and the syntax it uses.
In the page on date/time functions, the age function is listed with two forms. Assuming you want to compare to "today", you want the form with a single argument:
Function: age(timestamp)
Return type: interval
Description: Subtract from current_date (at midnight)
Example: age(timestamp '1957-06-13')
Result: 43 years 8 mons 3 days
So, you have a function which takes a value of type timestamp, and returns a value of type interval.
The example shows the input being specified as timestamp '1957-06-13'; this is just a way of creating a value of type timestamp from a hard-coded value - like creating an object in an object-oriented language. In your query, birthdate is not a hard-coded value, it's the name of a column, so this is not the syntax you want. If the column is of type timestamp, you can just use age(birthdate) directly; if not, you might need to convert it, e.g. age(CAST(birthdate AS timestamp)).
The output is of type interval, not a number of years, so comparing it against 12 is unlikely to do what you want. Instead, you should compare it against another interval value. Similar to the timestamp '1957-06-13' example, you can write interval '12 years' to directly create an interval value representing 12 years.
So your comparison would look like age(birthdate) >= interval '12 years'.
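To make the comparison concrete: age(birthdate) >= interval '12 years' asks whether at least twelve whole years have elapsed since the birthdate. The same check is sketched below in Python, for illustration only. The helper names are hypothetical, and edge cases such as 29 February birthdays may not match PostgreSQL's interval arithmetic exactly.

```python
from datetime import date

def age_in_years(birthdate, today):
    """Whole years elapsed between birthdate and today."""
    years = today.year - birthdate.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years

def at_least_12(birthdate, today):
    """Rough analogue of age(birthdate) >= interval '12 years'."""
    return age_in_years(birthdate, today) >= 12
```

A student whose twelfth birthday is today passes the check; one whose twelfth birthday is tomorrow does not.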
I don't know that tutorial you are talking about, but the documentation has the following to say about column labels:
The entries in the select list can be assigned names for subsequent processing, such as for use in an ORDER BY clause or for display by the client application.
Note the word "subsequent" here: the SELECT list is (logically) processed after the WHERE clause, so you cannot use column labels there.
You'll have to repeat the expression. This is in accordance with the SQL standard.
Moreover, birthdate is not a string literal, so don't quote it. And remove the timestamp.

How to optimize a batch pivotization?

I have a datetime list (which, for some reason, I call column date) containing over 1k datetimes.
adates:2017.10.20T00:02:35.650 2017.10.20T01:57:13.454 ...
For each of these dates I need to select the data from some table, then pivotize by a column t i.e. expiry, add the corresponding date datetime as column to the pivotized table and stitch together the pivotization for all the dates. Note that I should be able to identify which pivotization corresponds to a date and that's why I do it one by one:
fPivot:{[adate;accypair]
t1:select from volatilitysurface_smile where date=adate,ccypair=accypair;
mycols:`atm`s10c`s10p`s25c`s25p;
t2:`t xkey 0!exec mycols#(stype!mid) by t:t from t1;
t3:`t xkey select distinct t,tenor,xi,volofvol,delta_type,spread from t1;
result:ej[`t;t2;t3];
:result}
I then call this function for every datetime adates as follows:
raze {[accypair;adate] `date xcols update date:adate from fPivot[adate;accypair] }[`EURCHF] @/: adates;
This takes about 90s. I wonder if there is a better way, e.g. doing one big pivotization rather than running one pivotization per date and then stitching it all together. The big issue I see is that I have no apparent way to include the date attribute as part of the pivotization, and the date cannot be lost, otherwise I can't reconcile the results.
If you haven't been to the wiki page on pivoting, it may be a good place to start. There is a section on a general pivoting function that makes some claims to being somewhat efficient:
One user reports:
This is able to pivot a whole day of real quote data, about 25 million
quotes over about 4000 syms and an average of 5 levels per sym, in a
little over four minutes.
As for general comments, I would say that the ej is unnecessary as it is a more general version of ij, allowing you to specify the key column. As both t2 and t3 have the same keying I would instead use:
t2 ij t3
Which may give you a very minor performance boost.
OK, I solved the issue by creating a batch version of the pivotization that keeps the date (datetime) table field in the group-by needed to pivot, i.e. changing by t:t from ... to by date:date,t:t from .... It went from 90s down to 150 milliseconds.
fBatchPivot:{[adates;accypair]
t1:select from volatilitysurface_smile where date in adates,ccypair=accypair;
mycols:`atm`s10c`s10p`s25c`s25p;
t2:`date`t xkey 0!exec mycols#(stype!mid) by date:date,t:t from t1;
t3:`date`t xkey select distinct date,t,tenor,xi,volofvol,delta_type,spread from t1;
result:0!(`date`t xasc t2 ij t3);
:result}
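The key idea, keeping date in the grouping key so that one pass replaces one pivot per date, can be sketched in Python. This is an illustration of the technique rather than a translation of the q code: batch_pivot and the row layout (dicts with date, t, stype and mid keys) are assumptions made for the example.

```python
from collections import defaultdict

def batch_pivot(rows, value_cols):
    """Pivot long rows (date, t, stype, mid) into one wide row per (date, t).

    Grouping on the composite key (date, t) keeps every date's pivot
    identifiable while processing all dates in a single pass.
    """
    wide = defaultdict(dict)
    for r in rows:
        wide[(r["date"], r["t"])][r["stype"]] = r["mid"]
    # Emit rows sorted by (date, t); stypes missing for a key become None.
    return [
        {"date": d, "t": t, **{c: cells.get(c) for c in value_cols}}
        for (d, t), cells in sorted(wide.items())
    ]

sample = [
    {"date": "2017.10.20", "t": 1, "stype": "atm", "mid": 7.1},
    {"date": "2017.10.20", "t": 1, "stype": "s10c", "mid": 0.4},
    {"date": "2017.10.21", "t": 1, "stype": "atm", "mid": 7.3},
]
pivoted = batch_pivot(sample, ["atm", "s10c"])
```

Each output row carries its own date, so there is no need to tag and stitch per-date results afterwards.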

Grouping by date difference/range

How would I write a statement that groups rows by looking at the monthly date range/difference? Example:
org_group | date       | second_group_by
A         | 30.10.2013 | 1
A         | 29.11.2013 | 1
A         | 31.12.2013 | 1
A         | 30.01.2015 | 2
A         | 27.02.2015 | 2
A         | 31.03.2015 | 2
A         | 30.04.2015 | 2
As long as there isn't a monthly date_diff > 1, rows should be in the same second_group_by. I hope it's clear enough to understand; the column second_group_by should be generated by the user... it doesn't exist in the table.
Date diff between which rows, though?
If you just want to separate years (or months or weeks) use
GROUP BY DATEPART(....)
That's Sybase or SQL Server, but other SQL dialects will have an equivalent.
If you have specific data ranges, get them into a table with start and end date-time and a monotonically increasing integer, join to that with a BETWEEN and GROUP BY the integer.
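If the ranges are not known in advance and should instead be derived from the rule in the question (start a new group whenever consecutive dates are more than one calendar month apart), the logic can be sketched in Python first. This is one reading of the question, not a confirmed requirement; month_index and assign_groups are hypothetical helpers, and the sketch handles a single org_group.

```python
from datetime import date

def month_index(d):
    """Calendar month number (year * 12 + month), ignoring the day of month."""
    return d.year * 12 + d.month

def assign_groups(dates):
    """Number runs of dates whose consecutive month gap is at most 1."""
    groups, current, prev = [], 0, None
    for d in sorted(dates):
        if prev is None or month_index(d) - month_index(prev) > 1:
            current += 1  # first row, or a gap of more than one month
        groups.append((d, current))
        prev = d
    return groups

sample = [
    date(2013, 10, 30), date(2013, 11, 29), date(2013, 12, 31),
    date(2015, 1, 30), date(2015, 2, 27), date(2015, 3, 31), date(2015, 4, 30),
]
grouped = assign_groups(sample)
```

In SQL dialects that support window functions, the same idea is usually expressed with LAG to detect the gap and a running SUM to number the groups.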

Convert PostgreSQL age function output to upper case

I am working with PostgreSQL 8.4.4. I am calculating the time difference between two Unix timestamps using PostgreSQL's age function. I am getting the output as expected. The only thing I want is to convert the time difference to UPPERCASE.
For example,
select coalesce(nullif(age(to_timestamp(1389078075), to_timestamp(1380703432))::text,''), UPPER('Missing')) FROM transactions_transactions WHERE id = 947
This query gives the result
3 mons 4 days 22:17:23
But I want this output to be like
3 MONTHS 4 DAYS 22:17:23
Note: I am using this for dynamic report generation, so I cannot convert it to UPPERCASE after fetching it from the database. I want it to be in UPPERCASE when it comes from the database itself, i.e., in the query.
PostgreSQL's upper() function should be used. Note that it uppercases the text produced by age() as-is, so the month part comes out as MONS rather than MONTHS:
SELECT upper(age(to_timestamp(1389078075), to_timestamp(1380703432))::text)
FROM transactions_transactions WHERE id = 947
As per the OP's comment and edit:
select upper(coalesce(nullif(age(to_timestamp(1389078075), to_timestamp(1380703432))::text,''), UPPER('Missing')))