I have a table with a column that holds information on customer locations using a grid system. Our HQ is at 0,0 and our customers' approximate locations, in miles from the HQ, are stored the same way, e.g. -1,9. I am using the VARCHAR data type to store the grid references.
I have been trying to write a query to extract the number of locations less than 5 miles from our HQ, but I keep getting rather arbitrary results. Do I need to convert the grid values to INT and remove the comma somehow, or am I doing something else fundamentally wrong?
Schema issues aside, we can make this work, though performance might suffer. The examples I've provided make an assumption, based on your sample data, that the coordinates are always stored as two integers separated by a comma (,).
The first example leverages PARSENAME to split the coordinates for comparison. PARSENAME isn't really needed for a simple "within 5 miles" question, but if you later want more granularity of direction, it at least gives you certainty about which ordinal is which.
In the second example I used STRING_SPLIT to get the same results.
(One thing to note: you said "locations less than 5 miles", but BETWEEN -5 AND 5 is inclusive, so it will also pick up locations that are exactly 5 miles out on either axis.)
/* Build out the table */
CREATE TABLE Client
(
Client VARCHAR(100)
, Client_Location VARCHAR(12)
)
INSERT INTO dbo.Client
(
Client,
Client_Location
)
VALUES
('HQ','0,0')
,('Cust1','-1,9')
,('Cust2','7,11')
,('Cust3','-5,5')
,('Cust4','4,1')
,('Cust5','5,6')
,('Cust6','6,5')
/* Ex 1 */
SELECT
*
FROM dbo.Client
WHERE
CAST(PARSENAME(REPLACE(Client_Location,',','.'),2) AS int) BETWEEN -5 AND 5 -- REPLACE turns '-1,9' into '-1.9' so PARSENAME can split it; ordinal 2 is the first coordinate
AND CAST(PARSENAME(REPLACE(Client_Location,',','.'),1) AS int) BETWEEN -5 AND 5 -- ordinal 1 is the second coordinate
/* Ex 2 */
SELECT
Client
, Client_Location
FROM dbo.Client
CROSS APPLY STRING_SPLIT(Client_Location,',') -- one row per coordinate
WHERE
value BETWEEN -5 AND 5 -- relies on implicit varchar-to-int conversion
GROUP BY Client, Client_Location
HAVING COUNT(Client) = 2 -- keep clients where both coordinates passed the filter
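If you want strictly less than 5 miles, and want to treat it as straight-line distance from 0,0 rather than a per-axis box, a small variation on Ex 1 does it. This is only a sketch: the Client <> 'HQ' filter is my assumption that you don't want to count the HQ row itself.
/* Ex 3 (sketch): strict "< 5 miles" as straight-line distance from 0,0 */
SELECT COUNT(*) AS Locations_Within_5_Miles
FROM dbo.Client
WHERE SQRT(
SQUARE(CAST(PARSENAME(REPLACE(Client_Location,',','.'),2) AS int))
+ SQUARE(CAST(PARSENAME(REPLACE(Client_Location,',','.'),1) AS int))
) < 5
AND Client <> 'HQ' -- assumption: exclude the HQ row from the count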
I am wondering if any of you can help me with my problem.
I have a table containing money exchanges between individuals. The table is composed of columns ID A and ID B, which are unique IDs, and another column with an integer, a price.
My problem is that I want to sum that integer for a specific individual, but the same individual can appear in either column ID A or column ID B, because the software puts the IDs into the columns at random. So I have two dimensions, ID A and ID B.
I have some experience in Tableau but I have hit a dead end on this one.
Do you have any idea?
Thanks a lot !
Julien
If you only need to sum one individual at a time, use a parameter for the IDs.
Something like the following:
// ZN() turns a NULL sum into 0, so the total doesn't go NULL when the ID only ever appears in one of the two columns
ZN(SUM(IF [PARAMETER_ID] = [ID_A] THEN [PRICE] END))
+
ZN(SUM(IF [PARAMETER_ID] = [ID_B] THEN [PRICE] END))
Matt got the answer. Make a custom SQL request to fuse the 2 ID columns. In the end you get twice as many rows, but hey, that's what I wanted ;)
Also, it seems to be the most reasonable way to solve this.
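For anyone who finds this later, the "fuse the 2 ID columns" custom SQL is usually just a UNION ALL that stacks both ID columns into one. A minimal sketch, where the table name exchanges and the column names id_a, id_b and price are my assumptions, not Julien's actual schema:
-- Sketch only: exchanges, id_a, id_b and price are assumed names
SELECT id_a AS individual_id, price FROM exchanges
UNION ALL
SELECT id_b AS individual_id, price FROM exchanges
Each exchange then appears twice, once per participant, so in Tableau you can put individual_id on rows and SUM(price) directly, with no parameter needed.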
I am trying to pivot using the crosstab function and am unable to achieve the requirement. Is there a way to perform crosstab dynamically, with a dynamic result set as well?
I have tried using the built-in crosstab function but was unable to meet my requirement.
select * from crosstab ('select item,cd, type, parts, part, cnt
from item
order by 1,2')
AS results (item text,cd text, SUM NUMERIC, AVG NUMERIC);
Sample Data:
ITEM CD TYPE PARTS PART CNT
Item 1 A AVG 4 1 10
Item 1 B AVG 4 2 20
Item 1 C AVG 4 3 30
Item 1 D AVG 4 4 40
Item 1 A SUM 4 1 10
Item 1 B SUM 4 2 20
Item 1 C SUM 4 3 30
Item 1 D SUM 4 4 40
Expected Results:
ITEM CD PARTS TYPE_1 CNT_1 TYPE_1 CNT_1 TYPE_2 CNT_2 TYPE_2 CNT_2 TYPE_3 CNT_3 TYPE_3 CNT_3 TYPE_4 CNT_4 TYPE_4 CNT_4
Item 1 A 4 AVG 10 SUM 10 AVG 20 SUM 20 AVG 30 SUM 30 AVG 40 SUM 40
The PARTS value is based on a parameter passed by the user. If the user passes 2, for example, there will be 4 rows in the result set (2 parts for AVG and 2 parts for SUM).
Can I achieve this requirement using CROSSTAB function or is there a custom SQL statement that need to be developed?
I'm not following your data, so I can't offer examples based on it, but I have been looking at pivot/cross-tab features over the past few days, and was looking at dynamic cross tabs just before seeing your post. I'm hoping that your question gets some good answers; I'll start off with a bit of background.
You can use the crosstab extension for standard cross tabs; what went wrong when you tried it? Here's an example I wrote for myself the other day with a bunch of comments and aliases for clarity. The pivot is looking at item scans to see where the scans were "to", like the warehouse or the floor.
/* Basic cross-tab example for crosstab (text) format of pivot command.
Notice that the embedded query has to return three columns, see the aliases.
#1 is the row label, it shows up in the output.
#2 is the category, which determines how many columns there are. *You have to work this out in advance to declare them in the return.*
#3 is the cell data, what goes in the cross tabs. Note that this form of the crosstab command may return NULL, and coalesce does not work.
To get rid of the null count/sums/whatever, you need crosstab (text, text).
*/
select *
from crosstab ('select
specialty_name as row_label,
scanned_to as column_splitter,
count(num_inst)::numeric as cell_data
from scan_table
group by 1,2
order by 1,2')
as scan_pivot (
row_label citext,
"Assembly" numeric,
"Warehouse" numeric,
"Floor" numeric,
"QA" numeric);
As a manual alternative, you can use a series of FILTER clauses. Here's an example that summarizes error_log records by day of the week. The "down" is the error name, the "across" (columns) are the days of the week.
select "error_name",
count(*) as "Overall",
count(*) filter (where extract(dow from "updated_dts") = 0) as "Sun",
count(*) filter (where extract(dow from "updated_dts") = 1) as "Mon",
count(*) filter (where extract(dow from "updated_dts") = 2) as "Tue",
count(*) filter (where extract(dow from "updated_dts") = 3) as "Wed",
count(*) filter (where extract(dow from "updated_dts") = 4) as "Thu",
count(*) filter (where extract(dow from "updated_dts") = 5) as "Fri",
count(*) filter (where extract(dow from "updated_dts") = 6) as "Sat"
from error_log
where "error_name" is not null
group by "error_name"
order by 1;
You can do the same thing with CASE, but FILTER is easier to write.
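For comparison, here is the "Sun" column from the query above rewritten with CASE against the same error_log table:
select "error_name",
count(case when extract(dow from "updated_dts") = 0 then 1 end) as "Sun"
from error_log
where "error_name" is not null
group by "error_name"
order by 1;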
It looks like you want something basic, maybe the FILTER solution appeals? It's easier to read than calls to crosstab(), since that was giving you trouble.
FILTER may well be slower than crosstab (the crosstab extension is written in C, and I'm not sure how smart FILTER is about reading off indexes), but I haven't tested it yet; it's on my to-do list. I'd be super interested if anyone can offer results. We're on 11.4.
I wrote a client-side tool to build FILTER-based pivots over the past few days. You supply the down and across fields and an aggregate formula, and the tool spits out the SQL, with support for COALESCE for folks who don't want NULL, ROLLUP, TABLESAMPLE, view creation, and some other stuff. It was a fun project. Why go to that effort? (Apart from the fun part.) Because I haven't found a way to do dynamic pivots that I actually understand. I love this quote:
"Dynamic crosstab queries in Postgres has been asked many times on SO all involving advanced level functions/types. Consider building your needed query in application layer (Java, Python, PHP, etc.) and pass it in a Postgres connected query call. Recall SQL is a special-purpose, declarative type while app layers are general-purpose, imperative types." – Parfait
So, I wrote a tool to pre-calculate and declare the output columns. But I'm still curious about dynamic options in SQL. If that's of interest to you, have a look at these two items:
https://postgresql.verite.pro/blog/2018/06/19/crosstab-pivot.html
Flatten aggregated key/value pairs from a JSONB field?
Deep magic in both.
Let's say I have the following table playgrounds:
serialnumber length breadth country
1 15 10 Brazil
2 12 11 Chile
3 14 10 Brazil
4 14 10 Brazil
Now I want to add a column area to the table that is essentially length * breadth.
Obviously, I can do this update:
UPDATE playgrounds set area = length*breadth where country = 'Brazil';
Using the above statement, I will have to unnecessarily compute length * breadth twice for serial numbers 3 and 4. Is there a way to add a group by and minimize the amount of calculation?
Something like:
UPDATE playgrounds set area = length*breadth where country = 'Brazil'
group by length, breadth;
The first thing to note is that you should not add the area as a column. Data items that happen to be the result of simple arithmetic operations do not need their own column.
The second point is that you don't need to worry about doing a multiplication operation once each for rows 3 and 4. That's almost zero effort for the server.
Third point is that if you are worried about rows 3 and 4, that means they are duplicated, and duplicated data should not be in the database. Consider deleting duplicates as described here: https://wiki.postgresql.org/wiki/Deleting_duplicates
To answer your question:
Is there a way, I could add group by and minimize the amount of calculations?
SELECT DISTINCT ON (1,2,3)
length, breadth, country, length * breadth AS area
FROM playgrounds
ORDER BY 1, 2, 3, serialnumber;
This takes the row with the smallest serialnumber from each set of duplicates. Detailed explanation:
Select first row in each GROUP BY group?
But consider e4c5's answer and Pavel's comment first. Don't store functionally dependent values that can be computed on the fly cheaply. Just drop duplicate rows and use a view:
To permanently delete dupes with greater serialnumber:
DELETE FROM playgrounds p
WHERE EXISTS (
SELECT 1
FROM playgrounds
WHERE length = p.length
AND breadth = p.breadth
AND country = p.country
AND serialnumber < p.serialnumber
);
Then:
CREATE VIEW playgrounds_plus AS
SELECT *, length * breadth AS area
FROM playgrounds;
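With the view in place there is nothing to UPDATE at all; area is computed whenever you read it, for example:
-- area comes from the view's expression, not from stored data
SELECT serialnumber, country, area
FROM playgrounds_plus
WHERE country = 'Brazil';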
Related:
Clean up SQL data before unique constraint
I need to group a table by the sum of a NUMC-column, which unfortunately seems not to be possible with ABAP / OpenSQL.
My code looks like this:
SELECT z~anln1
FROM zzanla AS z
INTO TABLE gt_
GROUP BY z~anln1 z~anln2
HAVING SUM( z~percent ) <> 100 " percent unfortunately is a NUMC -> summing up not possible
What would be the best / easiest practices here as I cannot alter the table itself?
Unfortunately the NUMC type is described as numeric text, so it ends up in the database as VARCHAR, and that is why functions like SUM or AVG cannot be used on it.
It all depends on how big your table is. If it is rather small, you could read the group fields and the values to be summed into an internal table, sum them using the COLLECT statement, and finally remove the rows for which the sum equals 100%.
One solution is to define the field in the table using a more appropriate type.
NUMC is often used for key fields - like document numbers, which there would never be a reason to add together.
I didn't find a smooth solution.
What I did was copy everything into an internal table, then loop over it converting the NUMC values to DEC values. Grouping and summing up worked at that point.
At the end, I converted the DEC values back to NUMC values.
It's been a while. I came back to this post because someone voted up my original answer. I was thinking about editing my old answer, but I decided to post a new one. When this question was asked in 2017 there were some restrictions, but now it can be done by using the CAST function in the new OpenSQL.
SELECT z~anln1
FROM zzanla AS z
INTO TABLE @gt_
GROUP BY z~anln1, z~anln2
HAVING SUM( CAST( z~percent AS INT4 ) ) <> 100
Yes, I know this question has been asked MANY times, but after reading all the posts I found that there wasn't an answer that fit my need. So, here's my question. I would like to take a column of values and pivot them into rows of 6 columns.
I want to take this:
G
081278
12
00123535
John Doe
123456
And turn it into this:
Letter Date Code Ammount Name Account
G 081278 12 00123535 John Doe 123456
I have 110000 values in this one column in one table called TempTable. I need all the values displayed because each row is an entity unto itself. For instance, there is one unique entry for all of the Letter, Date, Code, Ammount, Name, and Account columns. I understand that an aggregate function is required, but is there a workaround that will let me get the desired result?
Just use a MAX aggregate
If one row = one column (per group of 6 rows) then MAX of a single value = that row value.
However, the data you've posted is insufficient. I don't see anything to:
associate the 6 rows per group
distinguish whether a row is "Letter" or "Name"
There is no implicit row order or number to rely upon to generate the groups
Unfortunately, the maximum number of columns in a SQL Server 2008 SELECT statement is 4,096, per the MSDN maximum capacity specifications.
Instead of using a pivot, you might consider dynamic SQL to get what you want to do.
Declare @SQLColumns nvarchar(max), @SQL nvarchar(max)
-- Build a comma-separated list of quoted values, e.g. 'val1','val2',...
select @SQLColumns = (select '''' + ColName + ''',' from TableName for XML Path(''))
-- Trim the trailing comma
set @SQLColumns = left(@SQLColumns, len(@SQLColumns) - 1)
set @SQL = 'Select ' + @SQLColumns
exec sp_executesql @SQL