Perl + PostgreSQL: Selective Column to Row Transpose

I'm trying to find a way to use Perl to further process PostgreSQL output. If there's a better way to do this via PostgreSQL, please let me know. I basically need to concatenate certain columns (Realtime, Value) in a file into a single row while keeping ID and CAT.
First time posting, so please let me know if I missed anything.
Input:
ID CAT Realtime Value
A 1 time1 55
A 1 time2 57
B 1 time3 75
C 2 time4 60
C 3 time5 66
C 3 time6 67
Output:
ID CAT Time Values
A 1 time1,time2 55,57
B 1 time3 75
C 2 time4 60
C 3 time5,time6 66,67

You could do this most simply in Postgres like so (using array columns):
CREATE TEMP TABLE output AS SELECT
id, cat, ARRAY_AGG(realtime) as time, ARRAY_AGG(value) as values
FROM input GROUP BY id, cat;
Then select whatever you want out of the output table.

SELECT id
, cat
, string_agg(realtime, ',') AS realtimes
, string_agg(value, ',') AS values
FROM input
GROUP BY 1, 2
ORDER BY 1, 2;
string_agg() requires PostgreSQL 9.0 or later and concatenates all values into a delimiter-separated string, while array_agg() (v8.4+) creates an array out of the input values.
About 1, 2 - I quote the manual on the SELECT command:
GROUP BY clause
expression can be an input column name, or the name or ordinal number
of an output column (SELECT list item), or ...
ORDER BY clause
Each expression can be the name or ordinal number of an output column
(SELECT list item), or
Emphasis mine. So that's just notational convenience, especially handy with complex expressions in the SELECT list.
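If you would still rather post-process the query output outside the database, as the question first asked, the grouping is a small dictionary exercise. Here is a rough sketch in Python rather than Perl; it assumes whitespace-separated rows with exactly four fields and the header line already removed, and collapse_rows is just a name made up for illustration.

```python
def collapse_rows(lines):
    """Group rows by (ID, CAT) and comma-join the Realtime and Value columns."""
    groups = {}  # dicts keep insertion order, so groups come out in first-seen order
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # assumption: anything that is not a 4-field row is skipped
        row_id, cat, realtime, value = parts
        times, values = groups.setdefault((row_id, cat), ([], []))
        times.append(realtime)
        values.append(value)
    return [
        f"{row_id} {cat} {','.join(times)} {','.join(values)}"
        for (row_id, cat), (times, values) in groups.items()
    ]


sample = [
    "A 1 time1 55",
    "A 1 time2 57",
    "B 1 time3 75",
    "C 2 time4 60",
    "C 3 time5 66",
    "C 3 time6 67",
]

for row in collapse_rows(sample):
    print(row)
```

Doing it in the query is still less work, since the database already knows the column types and the grouping keys.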

Related

select maximum column name from different table in a database

I am comparing columns across different tables to get the COLUMN_NAME of the MAXIMUM value in each.
Examples.
These are example tables: Fruit_tb, Vegetables_tb, State_tb, Food_tb
Under Fruit_tb
fr_id fruit_one fruit_two
1 20 50
Under Vegetables_tb (v = Vegetables)
v_id v_one V_two
1 10 9
Under State_tb
stateid stateOne stateTwo
1 70 87
Under Food_tb
foodid foodOne foodTwo
1 10 3
Now here is the scenario, I want to get the COLUMN NAMES of the max or greatest value in each table.
You can find the row that contains the max value of a column. For example:
SELECT fr_id , MAX(fruit_one) FROM Fruit_tb GROUP BY fr_id;
To find the row that holds the max value of a column across the whole table:
SELECT fr_id, fruit_one FROM Fruit_tb WHERE fruit_one = (SELECT MAX(fruit_one) FROM Fruit_tb) ORDER BY fr_id DESC LIMIT 1;
A follow up SO for the above scenario.
Maybe you can use GREATEST to get the name of the column that has the max value. What I'm not sure about is whether you'll be able to retrieve the columns of all the different tables at once. You can do something like this to retrieve from a single table:
SELECT CASE GREATEST(`fruit_one`, `fruit_two`)
WHEN `fruit_one` THEN 'fruit_one'
WHEN `fruit_two` THEN 'fruit_two'
END AS maxcol,
GREATEST(`fruit_one`, `fruit_two`) AS maxvalue FROM Fruit_tb;
Maybe this SO could help you. Hope it helps!
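If the rows are fetched into application code anyway, picking the column name with the largest value per table is simple on the client side. A minimal sketch in Python, assuming each row arrives as a dict keyed by column name; max_column and the tables layout are made up for illustration, and the values are the ones from the question.

```python
def max_column(row, columns):
    """Return (column_name, value) for the largest value among the given columns."""
    name = max(columns, key=lambda col: row[col])
    return name, row[name]


# One row per table, copied from the example tables in the question.
tables = {
    "Fruit_tb": ({"fr_id": 1, "fruit_one": 20, "fruit_two": 50}, ["fruit_one", "fruit_two"]),
    "Vegetables_tb": ({"v_id": 1, "v_one": 10, "V_two": 9}, ["v_one", "V_two"]),
    "State_tb": ({"stateid": 1, "stateOne": 70, "stateTwo": 87}, ["stateOne", "stateTwo"]),
    "Food_tb": ({"foodid": 1, "foodOne": 10, "foodTwo": 3}, ["foodOne", "foodTwo"]),
}

for table_name, (row, columns) in tables.items():
    print(table_name, max_column(row, columns))
```

The id columns are left out of the comparison on purpose, since they are keys rather than measurements.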

Putting keyword data into a csv file MATLAB

Given a table of the following format in MATLAB:
userid | itemid | keywords
A = [ 3 10 'book'
3 10 'briefcase'
3 10 'boat'
12 20 'windows'
12 20 'picture'
12 35 'love'
4 10 'day'
12 10 'working day'
... ... ... ];
where A is a table of size (58000*3), I want to write the data in a csv file with the following format:
csv.file
itemid keywords
10 book, briefcase, boat, day, working day, ...
20 windows, picture, ...
35 love, ...
where the list of itemids is stored in Iids = [10,20,35,...]
I would like to avoid using loops for this, since as you can imagine the matrix is large. Any idea is appreciated.
I wasn't able to think of a solution without loops. But you can optimize your loop by:
using logical indexing
running such loop only M times (if M is the number of unique itemid elements) instead of N times (if N is the number of elements in your table).
The solution I came up with is this.
First of all, create your table
A=table([3;3;3;12;12;12;4;12], [10;10;10;20;20;35;10;10],{'book','briefcase','boat','windows','picture','love','day','working day'}','VariableNames',{'userid','itemid','keywords'});
Select the unique values for column itemid (your Iids):
Iids=unique(A.itemid);
Create a new, empty, table which will contain the results:
NewTable=table();
And now the minimal loop I've come up with:
for id=Iids'
% select rows with given itemid value
RowsWithGivenId=A(A.itemid==id,:);
% create new row in NewTable with the id and the (joined together) keywords from the selected rows
NewTable=[NewTable; table(id,{strjoin(RowsWithGivenId.keywords,', ')})];
end
Also, set the column names in NewTable:
NewTable.Properties.VariableNames = {'itemid','keywords'};
NewTable now holds one row per itemid, with that item's keywords joined into a single string.
Please note: because the keywords in the new table are separated by commas, a csv file is not the format I recommend. Writing the table with writetable(NewTable,'myfile.csv'); produces a file in which the keyword lists themselves contain commas.
If you instead use ; rather than a comma as the separator in strjoin(), you'll get a nicer format.
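For comparison, here is the same group-then-join idea sketched in Python instead of MATLAB. It is only an illustration under assumptions: the rows are taken as plain (userid, itemid, keyword) tuples copied from the question, and keywords_by_item and to_csv are made-up names. It makes a single pass over the rows, and the standard csv module quotes any field that contains a comma, which is one way around the delimiter clash mentioned above.

```python
import csv
import io

rows = [
    (3, 10, "book"),
    (3, 10, "briefcase"),
    (3, 10, "boat"),
    (12, 20, "windows"),
    (12, 20, "picture"),
    (12, 35, "love"),
    (4, 10, "day"),
    (12, 10, "working day"),
]


def keywords_by_item(rows):
    """Collect keywords per itemid in one pass, keeping the original row order."""
    grouped = {}
    for _userid, itemid, keyword in rows:
        grouped.setdefault(itemid, []).append(keyword)
    return grouped


def to_csv(grouped):
    """Render itemid/keywords pairs as CSV text, one line per itemid."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["itemid", "keywords"])
    for itemid in sorted(grouped):
        # Fields containing commas are quoted automatically by csv.writer.
        writer.writerow([itemid, ", ".join(grouped[itemid])])
    return buffer.getvalue()


print(to_csv(keywords_by_item(rows)))
```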

Substring SQL Select statement

I have a number of references with a length of 20 and I need to remove the 1st 12 numbers, replace with a G and select the next 7 numbers
An example of the format of the numbers being received
50125426598525412584
I then need to remove first 12 digits and select the next 7 (not including the last)
2541258
Lastly I need to put a G in front of the number so I'm left with
G2541258
My SQL is as follows:
SELECT SUBSTRING(ref, 13, 7) AS ref
FROM mytable
WHERE ref LIKE '5012%'
The results of this will leave me with
2541258
But how do I insert the G in front of the number in the same SQL statement?
Many thanks
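SUBSTRING positions start at 1, so the 13th character is the first one after the 12 dropped digits. Where strings are concatenated with + (e.g. SQL Server):
SELECT 'G' + SUBSTRING(ref, 13, 7) AS ref FROM mytable WHERE ref LIKE '5012%'
Where CONCAT() is available (e.g. MySQL):
SELECT CONCAT('G', SUBSTRING('50125426598525412584', 13, 7)) FROM dual;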
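To double-check the offsets outside the database, the same slice can be tried in Python. This is only a sanity-check sketch; short_ref is a made-up helper name. Python slices are 0-based with an exclusive end, so characters 13 to 19 in SQL terms are ref[12:19].

```python
def short_ref(ref):
    """Drop the first 12 characters, keep the next 7, and put a 'G' in front."""
    return "G" + ref[12:19]


print(short_ref("50125426598525412584"))  # G2541258
```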

how to get grouped query data from the resultset?

I want to get grouped data from a table in sqlite. For example, the table is like below:
Name Group Price
a 1 10
b 1 9
c 1 10
d 2 11
e 2 10
f 3 12
g 3 10
h 1 11
Now I want to get all data grouped by the Group column, each group in one array, namely
array1 = {{a,1,10},{b,1,9},{c,1,10},{h,1,11}};
array2 = {{d,2,11},{e,2,10}};
array3 = {{f,3,12},{g,3,10}}.
I need these two-dimensional arrays to populate the grouped table view. The SQL statement may be NSString *sql = @"SELECT * FROM table GROUP BY Group"; but I wonder how to get the data from the result set. I am using FMDB.
Any help is appreciated.
Get the data from sql with a normal SELECT statement, ordered by group and name:
SELECT * FROM table ORDER BY group, name;
Then in code, build your arrays, switching to fill the next array when the group id changes.
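The "switch arrays when the group id changes" step can be sketched as follows. This is Python with the built-in sqlite3 module rather than Objective-C with FMDB, so treat it as an outline of the logic only; the table name items and the column layout are assumptions made for the example, and the group column is quoted because GROUP is a reserved word.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE items (name TEXT, "group" INTEGER, price INTEGER)')
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [("a", 1, 10), ("b", 1, 9), ("c", 1, 10), ("d", 2, 11),
     ("e", 2, 10), ("f", 3, 12), ("g", 3, 10), ("h", 1, 11)],
)

# Plain SELECT ordered by group, so rows of the same group are adjacent.
rows = conn.execute(
    'SELECT name, "group", price FROM items ORDER BY "group", name'
).fetchall()

# groupby starts a new run whenever the key (the group id) changes.
grouped = [list(members) for _group_id, members in groupby(rows, key=lambda r: r[1])]

for section in grouped:
    print(section)
```

Each inner list then maps onto one section of the grouped table view.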
Let me clarify GROUP BY. You can group data, but then the other columns need an aggregate function.
e.g. A table holds a list of students with a gender column (Male and Female), so grouping the table by gender returns two sets. We then need to perform some operation on the result column.
e.g. Maximum marks or average marks of each group.
In your case you want to group, but what kind of operation do you need on the price column?
e.g. The query below returns each group with its max price.
SELECT Group,MAX(Price) AS MaxPriceByEachGroup FROM TABLE GROUP BY(group)

Postgresql sorting mixed alphanumeric data

Running this query:
select name from folders order by name
returns these results:
alphanumeric
a test
test 20
test 19
test 1
test 10
But I expected:
a test
alphanumeric
test 1
test 10
test 19
test 20
What's wrong here?
You can simply cast name column to bytea data type allowing collate-agnostic ordering:
SELECT name
FROM folders
ORDER BY name::bytea;
Result:
name
--------------
a test
alphanumeric
test 1
test 10
test 19
test 20
(6 rows)
All of these methods sorted my selection in alphabetical order:
test 1
test 10
test 2
test 20
This solution worked for me (lc_collate: 'ru_RU.UTF8'):
SELECT name
FROM folders
ORDER BY SUBSTRING(name FROM '([0-9]+)')::BIGINT ASC, name;
test 1
test 2
test 10
test 20
select * from "public"."directory" where "directoryId" = 17888 order by
COALESCE(SUBSTRING("name" FROM '^(\d+)')::INTEGER, 99999999),
SUBSTRING("name" FROM '[a-zA-Z_-]+'),
COALESCE(SUBSTRING("name" FROM '(\d+)$')::INTEGER, 0),
"name";
NOTE: Escape the regex as you need, in some languages, you will have to add one more "\".
In my Postgres DB, the name column contains the following when I use a simple ORDER BY name query:
1
10
2
21
A
A1
A11
A5
B
B2
B22
B3
M 1
M 11
M 2
Result of the query after I modified it:
1
2
10
21
A
A1
A5
A11
B
B2
B3
B22
M 1
M 2
M 11
You may be able to sort manually by splitting the text up in case there are trailing numerals, like so:
SELECT * FROM sort_test
ORDER BY SUBSTRING(text FROM '^(.*?)( \\d+)?$'),
COALESCE(SUBSTRING(text FROM ' (\\d+)$')::INTEGER, 0);
This will sort on column text, first by all characters optionally excluding an ending space followed by digits, then by those optional digits.
Worked well in my test.
Update: fixed the string-only sorting with a simple coalesce (duh).
OverZealous's answer helped me, but it didn't work if the string in the database began with numbers followed by additional characters.
The following worked for me:
SELECT name
FROM folders
ORDER BY
COALESCE(SUBSTRING(name FROM '^(\\d+)')::INTEGER, 99999999),
SUBSTRING(name FROM '^\\d* *(.*?)( \\d+)?$'),
COALESCE(SUBSTRING(name FROM ' (\\d+)$')::INTEGER, 0),
name;
So this one:
Extracts the first number in the string, or uses 99999999.
Extracts the string that follows the possible first number.
Extracts a trailing number, or uses 0.
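If sorting on the client side is an option, the same natural ordering can be had without regex gymnastics in the ORDER BY. Below is a rough Python sketch; it is a general-purpose natural sort key and not a line-by-line translation of the query above, and natural_key is a made-up name. It splits each name into text and digit runs and compares the digit runs as integers.

```python
import re


def natural_key(name):
    """Build a sort key where digit runs compare numerically and text runs case-insensitively."""
    # re.split with a capturing group keeps the digit runs, always at odd indices,
    # so position-wise comparison never mixes integers with strings.
    parts = re.split(r"([0-9]+)", name)
    return [int(part) if index % 2 else part.lower() for index, part in enumerate(parts)]


folders = ["alphanumeric", "a test", "test 20", "test 19", "test 1", "test 10"]
print(sorted(folders, key=natural_key))
```

Names that start with digits sort ahead of names that start with letters, because their leading text run is empty.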
A Vlk's answer above helped me a lot, but it sorted items only by the numeric part, which in my case came second. My data had a string part, a space, and a numeric part (desk 1, desk 2, desk 3 ...). The syntax in A Vlk's answer returned the data sorted by the number, and it was the only answer above that did the trick. However, when the string part differed (e.g. desk 3, desk 4, table 1, desk 5 ...), table 1 would come before desk 2. I fixed this using the syntax below:
...order by SUBSTRING(name, '\\w+'), SUBSTRING(name FROM '([0-9]+)')::BIGINT ASC;
Tor's last SQL worked for me. However, if you are calling this code from PHP you need to add extra backslashes.
SELECT name
FROM folders
ORDER BY
COALESCE(SUBSTRING(name FROM '^(\\\\d+)')::INTEGER, 99999999),
SUBSTRING(name FROM '^\\\\d* *(.*?)( \\\\d+)?$'),
COALESCE(SUBSTRING(name FROM ' (\\\\d+)$')::INTEGER, 0),
name;