Does SQL have a way to group rows without squashing the group into a single row? - group-by

I want to do a single query that outputs an array of arrays of table rows. Think along the lines of <table><rowgroup><tr><tr><tr><rowgroup><tr><tr>. Is SQL capable of this? (specifically, as implemented in MariaDB, though migration to AWS RDS might occur one day)
The GROUP BY statement alone does not do this, it creates one row per group.
Here's an example of what I'm thinking of…
SELECT * FROM memes;
+------------+----------+
| file_name | file_ext |
+------------+----------+
| kittens | jpeg |
| puppies | gif |
| cats | jpeg |
| doggos | mp4 |
| horses | gif |
| chickens | gif |
| ducks | jpeg |
+------------+----------+
SELECT * FROM memes GROUP BY file_ext WITHOUT COLLAPSING GROUPS;
+------------+----------+
| file_name | file_ext |
+------------+----------+
| kittens | jpeg |
| cats | jpeg |
| ducks | jpeg |
+------------+----------+
| puppies | gif |
| horses | gif |
| chickens | gif |
+------------+----------+
| doggos | mp4 |
+------------+----------+
I've been using MySQL for ~20 years and have not come across this functionality before but maybe I've just been looking in the wrong place ¯\_(ツ)_/¯

I haven't seen an array rendering such as the one you want, but you can simulate it with multiple GROUP BY / GROUP_CONCAT() clauses.
For example:
select concat('[', group_concat(g), ']') as a
from (
select concat('[', group_concat(file_name), ']') as g
from memes
group by file_ext
) x
Result:
a
---------------------------------------------------------
[[puppies,horses,chickens],[kittens,cats,ducks],[doggos]]
See running example at DB Fiddle.
You can tweak the delimiters such as ,, [, and ].

SELECT ... ORDER BY file_ext will come close to your second output.
Using GROUP BY ... WITH ROLLUP would let you do subtotals under each group, which is not what you wanted either, but it would give you extra lines where you want the breaks.

Related

Flatten Postgers left join query result with dynamic values into one row

I have two tables products and product_attributs. One Product can have one or many attributs and these are filled by a dynamic web form (name and value inputs) added by the user as needed. For example for a drill the user could decide to add two attributs : color=blue and power=100 watts. For another product it could be 3 or more different attribus and for another it could have no special attributs.
products
| id | name | identifier | identifier_type | active
| ----------|--------------|-------------|------------------|---
| 1 | Drill | AD44 | barcode | true
| 2 | Polisher | AP211C | barcode | true
| 3 | Jackhammer | AJ2133 | barcode | false
| 4 | Screwdriver | AS4778 | RFID | true
product_attributs
|id | name | value | product_id
|----------|--------------|-------------|----------
|1 | color | blue | 1
|2 | power | 100 watts | 1
|3 | size | 40 cm | 2
|4 | energy | electrical | 3
|4 | price | 35€ | 3
so attributs could be anything which are set dynamically by the user. My need is to generate a report on CSV which contain all products with their attributs. Without a good experience in SQL I generated the following basic request :
SELECT pr.name, pr.identifier_type, pr.identifier, pr.active, att.name, att.value
FROM products as pr
LEFT JOIN product_attributs att ON pr.id = att.product_id
as you know the result will contain for the same product as many rows as attributs it has and this is not ideal for reporting. The ideal would be this :
|name | identifier_type | identifier | active | name | value | name | value
|-----------|-----------------|------------|--------|--------|-------|------ |------
|Drill | barcode | AD44 | true | color | blue | power | 100 w
|Polisher | barcode | AP211C | true | size | 40 cm | null | null
|Jackhammer | barcode | AJ2133 | true | energy | elect | price | 35 €
|Screwdriver| barcode | AS4778 | true | null | null | null | null
here I only showed a max of two attributes per product but it could be more if needed. Well I did some research and came across the pivot with crosstab function on Postgres but the problem it requests static values but this does not match my need.
thanks lot for your help and sorry for duplicates if any.
Thanks Laurenz Albe for your help. array_agg solved my problem. Here is the query if someone may be interested in :
SELECT
pr.name, pr.description, pr.identifier_type, pr.identifier,
pr.internal_identifier, pr.active,
ARRAY_TO_STRING(ARRAY_AGG (oa.name || ' = ' || oa.value),', ') attributs
FROM
products pr
LEFT JOIN product_attributs oa ON pr.id = oa.product_id
GROUP BY
pr.name, pr.description, pr.identifier_type, pr.identifier,
pr.internal_identifier, pr.active
ORDER BY
pr.name;

SPSS group by rows and concatenate string into one variable

I'm trying to export SPSS metadata to a custom format using SPSS syntax. The dataset with value labels contains one or more labels for the variables.
However, now I want to concatenate the value labels into one string per variable. For example for the variable SEX combine or group the rows F/Female and M/Male into one variable F=Female;M=Male;. I already concatenated the code and labels into a new variable using Compute CodeValueLabel = concat(Code,'=',ValueLabel).
so the starting point for the source dataset is like this:
+--------------+------+----------------+------------------+
| VarName | Code | ValueLabel | CodeValueLabel |
+--------------+------+----------------+------------------+
| SEX | F | Female | F=Female |
| SEX | M | Male | M=Male |
| ICFORM | 1 | Yes | 1=Yes |
| LIMIT_DETECT | 0 | Too low | 0=Too low |
| LIMIT_DETECT | 1 | Normal | 1=Normal |
| LIMIT_DETECT | 2 | Too high | 2=Too high |
| LIMIT_DETECT | 9 | Not applicable | 9=Not applicable |
+--------------+------+----------------+------------------+
The goal is to get a dataset something like this:
+--------------+-------------------------------------------------+
| VarName | group_and_concatenate |
+--------------+-------------------------------------------------+
| SEX | F=Female;M=Male; |
| ICFORM | 1=Yes; |
| LIMIT_DETECT | 0=Too low;1=Normal;2=Too high;9=Not applicable; |
+--------------+-------------------------------------------------+
I tried using CASESTOVARS but that creates separate variables, so several variables not just one single string variable. I'm starting to suspect that I'm running up against the limits of what SPSS can do. Although maybe it's possible using some AGGREGATE or OMS trickery, any ideas on how to do this?
First I recreate your example here to demonstrate on:
data list list/varName CodeValueLabel (2a30).
begin data
"SEX" "F=Female"
"SEX" "M=Male"
"ICFORM" "1=Yes"
"LIMIT_DETECT" "0=Too low"
"LIMIT_DETECT" "1=Normal"
"LIMIT_DETECT" "2=Too high"
"LIMIT_DETECT" "9=Not applicable"
end data.
Now to work:
* sorting to make sure all labels are bunched together.
sort cases by varName CodeValueLabel.
string combineall (a300).
* adding ";" .
compute combineall=concat(rtrim(CodeValueLabel), ";").
* if this is the same varname as last row, attach the two together.
if $casenum>1 and varName=lag(varName)
combineall=concat(rtrim(lag(combineall)), " ", rtrim(combineall)).
exe.
*now to select only relevant lines - first I identify them.
match files /file=* /last=selectthis /by varName.
*now we can delete the rest.
select if selectthis=1.
exe.
NOTE: make combineall wide enough to contain all the values of your most populated variable.

SQL parameter table

I suspect this question is already well-answered but perhaps due to limited SQL vocabulary I have not managed to find what I need. I have a database with many code:description mappings in a single 'parameter' table. I would like to define a query or procedure to return the descriptions for all (or an arbitrary list of) coded values in a given 'content' table with their descriptions from the parameter table. I don't want to alter the original data, I just want to display friendly results.
Is there a standard way to do this?
Can it be accomplished with SELECT or are other statements required?
Here is a sample query for a single coded field:
SELECT TOP (5)
newid() as id,
B.BRIDGE_STATUS,
P.SHORTDESC
FROM
BRIDGE B
LEFT JOIN PARAMTRS P ON P.TABLE_NAME = 'BRIDGE'
AND P.FIELD_NAME = 'BRIDGE_STATUS'
AND P.PARMVALUE = B.BRIDGE_STATUS
ORDER BY
id
I want to produce 'decoded' results like:
| id | BRIDGE_STATUS |
|--------------------------------------|------------ |
| BABCEC1E-5FE2-46FA-9763-000131F2F688 | Active |
| 758F5201-4742-43C6-8550-000571875265 | Active |
| 5E51634C-4DD9-4B0A-BBF5-00087DF71C8B | Active |
| 0A4EA521-DE70-4D04-93B8-000CD12B7F55 | Inactive |
| 815C6C66-8995-4893-9A1B-000F00F839A4 | Proposed |
Rather than original, coded data like:
| id | BRIDGE_STATUS |
|--------------------------------------|---------------|
| F50214D7-F726-4996-9C0C-00021BD681A4 | 3 |
| 4F173E40-54DC-495E-9B84-000B446F09C3 | 3 |
| F9C216CD-0453-434B-AFA0-000C39EFA0FB | 3 |
| 5D09554E-201D-4208-A786-000C537759A1 | 1 |
| F0BDB9A4-E796-4786-8781-000FC60E200C | 4 |
but for an arbitrary number of columns.

Cross tab with a list of values instead of summation

I want a Cross tab that lists field values and counts them instead of just giving a count for the summation. I know I could make this with groups but I cant list the values vertically that way. From my research I believe I have to use a Display String Formula.
SQL Field Data
-------------------------------------------------
| Play # | Formation |Back Set | R/P | PLAY |
-------------------------------------------------
| 1 | TREY | FG | R | TRUCK |
-------------------------------------------------
| 2 | T | FG | R | RHINO |
-------------------------------------------------
| 3 | D | FG | P | 5 STEP |
-------------------------------------------------
| 4 | D | FG | P | 5 STEP |
-------------------------------------------------
| 5 | K JET | NG | R | DOG |
-------------------------------------------------
Desired report structure:
-----------------------------------------------------------
| Backet & Formation | Run | Pass |
-----------------------------------------------------------
| NG K JET | BULLA 1 | |
| | HELL 3 | |
-----------------------------------------------------------
| FG D | | 5 STEP 2 |
-----------------------------------------------------------
| NG K JET | DOG | |
-----------------------------------------------------------
| FG T | RHINO | |
-----------------------------------------------------------
Don't see why a Crosstab is necessary for this - especially if the entire body of the report is just that table.
Group your records by Bracket and Formation - If that's not
something natively configured in your table, make a new Formula field
and group on that.
Drop the 3 relevant fields into whichever section you need to display. (It might be a Footer, based on whether or not you want repeats
Write a formula to determine whether or not Run or Pass are displayed, and place it in their suppression field. (Good luck getting a Crosstab to do that for you! It tends to prefer 0s over blanks.)
If there's more to the report than just this table, you can cheat the system by placing your "table" into a subreport. And of course you can stretch Line objects across the sections and it will stretch to form the table outlines

Tableau to create single chart from multiple parameters

I have tableau workbook online
Before, I had filter for single Principal, and applied to all CUSIPs, and I was able to plot all the inflation-adjusted principals based on Index ratios for a particular date, (refer tab Inflation-Adjusted Trend) i.e.
Now, I have multiple filters based on multiple Principals, i.e. buy one CUSIP for $1500, buy another for $900, etc (refer tab Infl-Adjusted Trend 2)
These were the columns and rows
But I do not like the format of this graph.
I wish to have all the lines together in one graph, just like the single-principal tab below ..... how to fix this? How to bring all the values into one chart?
You currently have six calculated fields calculating your inflation-adjusted principals, one for each CUSIP. Here's what that table might end up looking like:
+-----------+-------------+-------------+-------------+-------------+-----+
| CUSIP | 912828H45 P | 912828NM8 P | 912828PP9 P | 912828QV5 P | ... |
+-----------+-------------+-------------+-------------+-------------+-----+
| 912828H45 | $100 | NULL | NULL | NULL | ... |
| 912828NM8 | NULL | $455 | NULL | NULL | ... |
| 912828PP9 | NULL | NULL | $132 | NULL | ... |
| 912828QV5 | NULL | NULL | NULL | $553 | ... |
| ... | ... | ... | ... | ... | ... |
+-----------+-------------+-------------+-------------+-------------+-----|
There's definitely a better way. Your fields are set up like this:
IF [Cusip] = "912828H45"
THEN
[912828H45 Principal] * [Index Ratio]
END
Instead of setting up one field per CUSIP, make a single field that calculates that value for each CUSIP.
IF [Cusip] = "912828H45"
THEN
[912828H45 Principal] * [Index Ratio]
ELSEIF [Cusip] = "912828NM8"
THEN
[912828NM8 Principal] * [Index Ratio]
...
END
Now your table looks like this.
+-----------+------------------------------+-----+
| CUSIP | Inflation-Adjusted Principal | ... |
+-----------+------------------------------+-----+
| 912828H45 | $100 | ... |
| 912828NM8 | $455 | ... |
| 912828PP9 | $132 | ... |
| 912828QV5 | $553 | ... |
| ... | ... | ... |
+-----------+------------------------------+-----+
That's a LOT easier to work with. Drag that single field into Rows and color by [Cusip].