postgres coalesce fields, sum and group by - postgresql

maybe someone can help me out with a postgres query.
the table structure looks like this
nummer nachname vorname cash
+-------+----------+----------+------+
2 Bert Brecht 0,758
2 Harry Belafonte 1,568
3 Elvis Presley 0,357
4 Mark Twain 1,555
4 Ella Fitz 0,333
…
How can I coalesce the fields where "nummer" are the same and sum the cash values?
My output should look like this:
2 Bert, Brecht 2,326
Harry, Belafonte
3 Elvis, Presley 0,357
4 Mark, Twain 1,888
Ella, Fitz
I think the part to coalesce should work something like this:
array_to_string(array_agg(nachname|| ', ' ||coalesce(vorname, '')), '<br />') as name,
Thanks for any help,
tony

SELECT
nummer,
string_agg(nachname||CASE WHEN vorname IS NULL THEN '' ELSE ', '||vorname END, E'\n') AS name,
sum(cash) AS total_cash
FROM Table1
GROUP BY nummer;
See this SQLFiddle; note that it doesn't display the newline characters between names, but they're still there.
The CASE statement is used instead of coalesce so you don't have a trailing comma on entries with a last name but no first name. If you want a trailing comma, use format('%s, %s',vorname,nachname) instead and avoid all that ugly string concatenation business:
SELECT
nummer, string_agg(format('%s, %s', nachname, vorname), E'\n'),
sum(cash) AS total_cash
FROM Table1
GROUP BY nummer;
If string_agg doesn't work, get a newer PostgreSQL, or mention the version in your questions so it's clear you're using an obsolete version. The query is trivially rewritten to use array_to_string and array_agg anyway.
If you're asking how to sum numbers that're actually represented as text strings like 1,2345 in the database: don't do that. Fix your schema. Format numbers on input and output instead, store them as numeric, float8, integer, ... whatever the appropriate numeric type for the job is.

Related

In DB2 SQL, is it possible to set a variable in the SELECT statement to use multiple times..?

In DB2 SQL, is it possible to SET a variable with the contents of a returned field in the SELECT statement, to use multiple times for calculated fields and criteria further along in the same SELECT statement?
The purpose is to shrink and streamline the code, by doing a calculation once at the beginning and using it multiple times later on...including the HAVING, WHERE, and ORDER BY.
To be honest, I'm not sure this is possible in any version of SQL, much less DB2.
This is on an IBM iSeries 8202 with DB2 SQL v6, which unfortunately is not a candidate for upgrade at this time. This is a very old & messy database, which I have no control over. I must regularly include "cleanup functions" in my SQL.
To to clarify the question, note the following pseudocode. Actual working code follows further below.
DECLARE smnum INTEGER --Not sure if this is correct.
SELECT
-- This is where I'm not sure what to do.
SET CAST((CASE WHEN %smnum%='' THEN '0' ELSE %smnum% END) AS INTEGER) INTO smnum,
%smnum% AS sm,
invdat,
invno,
daqty,
dapric,
dacost,
(dapric-dacost)*daqty AS profit
FROM
saleshistory
WHERE
%smNum% = 30
ORDER BY
%smnum%
Below is my actual working SQL. When adjusted for 2017 or 2016, it can return >10K rows, depending on the salesperson. The complete table has >22M rows.
That buttload of CASE((CAST... function is what I wish to replace with a variable. This is not the only example of this. If I can make it work, I have many other queries that could benefit from the technique.
SELECT
CAST((CASE WHEN TRIM(DASM#)='' THEN '0' ELSE TRIM(DASM#) END) AS INTEGER) AS DASM,
DAIDAT,
DAINV# AS DAINV,
DALIN# AS DALIN,
CAST(TRIM(DAITEM) AS INTEGER) AS DAITEM,
TRIM(DABSW) AS DABSW,
TRIM(DAPCLS) AS DAPCLS,
DAQTY,
DAPRIC,
DAICOS,
DADPAL,
(DAPRIC-DAICOS+DADPAL)*DAQTY AS PROFIT
FROM
VIPDTAB.DAILYV
WHERE
CAST((CASE WHEN TRIM(DASM#)='' THEN '0' ELSE TRIM(DASM#) END) AS INTEGER)=30 AND
TRIM(DABSW)='B' AND
DAIDAT BETWEEN (YEAR(CURDATE())*10000) AND (((YEAR(CURDATE())+1)*10000)-1) AND
CAST(TRIM(DACOMP) AS INTEGER)=1
ORDER BY
CAST((CASE WHEN TRIM(DASM#)='' THEN '0' ELSE TRIM(DASM#) END) AS INTEGER),
DAIDAT,
DAINV#,
DALIN#
Just use a subquery or CTE. I can't figure out the actual logic you want, but the structure looks like this:
select . . .
from (select d.*,
(CASE . . . END) as calc_field
from VIPDTAB.DAILYV d
) d
No variable declaration is needed.
Here is what your SQL would look like with the sub-query that Gordon suggested:
SELECT
DASM,
DAIDAT,
DAINV# AS DAINV,
DALIN# AS DALIN,
CAST(DAITEM AS INTEGER) AS DAITEM,
TRIM(DABSW) AS DABSW,
TRIM(DAPCLS) AS DAPCLS,
DAQTY,
DAPRIC,
DAICOS,
DADPAL,
(DAPRIC-DAICOS+DADPAL)*DAQTY AS PROFIT
FROM
(SELECT
D.*,
CAST((CASE WHEN D.DASM#='' THEN '0' ELSE D.DASM# END) AS INTEGER) AS DASM
FROM VIPDTAB.DAILYV D
) D
WHERE
DASM=30 AND
TRIM(DABSW)='B' AND
DAIDAT BETWEEN (YEAR(CURDATE())*10000) AND (((YEAR(CURDATE())+1)*10000)-1) AND
CAST(DACOMP AS INTEGER)=1
ORDER BY
DASM,
DAIDAT,
DAINV#,
DALIN#
Notice that I removed a lot of the trim() functions, and you could likely remove the rest. The way IBM resolves the Varchar vs. Char comparison thing is by ignoring trailing blanks. So trim(anything) = '' is the same as anything = ''. And since cast(' 123 ' as integer) = 123, I have removed trims from within the cast functions as well. In addition trim(dabsw) = 'B' is the same as dabsw = 'B' as long as the 'B' is the first character in dabsw. So you could even remove that trim if all you are concerned with is trailing blanks.
Here are some additional notes based on comments. The above paragraph is not talking about auto-trim. Fixed length fields will always return as fixed length fields, the trailing blanks will remain. But in comparisons and expressions where trailing blanks are unimportant, or even a hindrance, they are ignored. In expressions where trailing blanks are important, like concatenation, the trailing blanks are not ignored. Another thing, trim() removes both leading and trailing blanks. If you are using trim() to read a fixed length character field into a Varchar, then rtrim() is likely the better choice as it only removes the trailing blanks.
Also, I didn't go through your fields to make sure I got everything you need, I just used * in the sub-query. For performance, it would be best to only return the fields you need. So if you replace D.* with an actual field list, you can remove the correlation name in the from clause of the sub-query. But, the sub-query itself still needs a correlation clause.
My verification was done using IBM i v7.1.
You can encapsalate the case statement in a view. I even have the fancy profit calc in there for you to order by profit. Now the biggest issue you have is the CCSID on the view for calculated columns but that's another question.
create or replace view VIPDTAB.DAILYVQ as
SELECT
CAST((CASE WHEN TRIM(DASM#)='' THEN '0' ELSE TRIM(DASM#) END) AS INTEGER) AS DASM,
DAIDAT,
DAINV# AS DAINV,
DALIN# AS DALIN,
CAST(TRIM(DAITEM) AS INTEGER) AS DAITEM,
TRIM(DABSW) AS DABSW,
TRIM(DAPCLS) AS DAPCLS,
DAQTY,
DAPRIC,
DAICOS,
DADPAL,
(DAPRIC-DAICOS+DADPAL)*DAQTY AS PROFIT
FROM
VIPDTAB.DAILYV
now you can
select dasm, count(*) from vipdtab.dailyvq where dasm = 0 group by dasm order by dasm
or
select * from vipdtab.dailyvq order by profit desc

SQL: Split post code value to return 'outer' part of code

I've got a results set of UK postcodes. Some are formatted with spaces, and some are not e.g. S14HG and S1 4HG
I want my select query to just return the outer part of the post code value in the results, i.e. 'S1'
I can do this in Excel using the following formula:
=IF(ISERROR(LEFT(A1,LEN(A1)-3)),””,LEFT(A1,LEN(A1)-3))
Is it possible to perform the same function in SQL through a SELECT query?
UK postcode can have one of many formats for their outward code.
However, as you can see from the possible formats in that link, there is a consistent format for the remainder of the postcode. If you are confident your postcodes are correct, you can simply remove any spaces and the last 3 characters:
declare #Postcodes table (Postcode nvarchar(10));
insert into #Postcodes values
('S1 4HG')
,('S14HG')
,('S10 4HG')
,('S104HG');
select Postcode
,replace(left(Postcode,len(Postcode)-3),' ','') as OutwardCode
from #Postcodes
Output:
Postcode OutwardCode
S1 4HG S1
S14HG S1
S10 4HG S10
S104HG S10
You can use LEFT() regardless of spaces since you only want the first two, which won't have a space.
SELECT LEFT('S1 4HG',2)
Or just get rid of the spaces...
declare #t varchar(64) = 'S 1 4 H G'
SELECT LEFT(REPLACE(#t,' ',''),2)

PostgreSQL convert a string with commas into an integer

I want to convert a column of type "character varying" that has integers with commas to a regular integer column.
I want to support numbers from '1' to '10,000,000'.
I've tried to use: to_number(fieldname, '999G999G999'), but it only works if the format matches the exact length of the string.
Is there a way to do this that supports from '1' to '10,000,000'?
select replace(fieldname,',','')::numeric ;
To do it the way you originally attempted, which is not advised:
select to_number( fieldname,
regexp_replace( replace(fieldname,',','G') , '[0-9]' ,'9','g')
);
The inner replace changes commas to G. The outer replace changes numbers to 9. This does not factor in decimal or negative numbers.
You can just strip out the commas with the REPLACE() function:
CREATE TABLE Foo
(
Test NUMERIC
);
insert into Foo VALUES (REPLACE('1,234,567', ',', '')::numeric);
select * from Foo; -- Will show 1234567
You can replace the commas by an empty string as suggested, or you could use to_number with the FM prefix, so the query would look like this:
SELECT to_number(my_column, 'FM99G999G999')
There are things to take note:
When using function REPLACE("fieldName", ',', '') on a table, if there are VIEW using the TABLE, that function will not work properly. You must drop the view to use it.

handle escape sequence char in DB2

I want to search a column and get values where value containts \ .
I tried select * from "Values" where "ValueName" like '\'. But returns no value.
Also tried like "\" and like'\''%' etc. But no results.
See the DB2 Documentation on the LIKE predicate, in particular the parts about escape expressions.
What you want is
select * from Values where ValueName like '\\%' escape '\'
To give an example of usage:
create table backslash_escape_test
(
backslash_escape_test_column varchar(20)
);
insert into backslash_escape_test(backslash_escape_test_column)
values ('foo\');
insert into backslash_escape_test(backslash_escape_test_column)
values ('no slashes here');
insert into backslash_escape_test(backslash_escape_test_column)
values ('foo\bar');
insert into backslash_escape_test(backslash_escape_test_column)
values ('\bar');
select count(*) from backslash_escape_test where
backslash_escape_test_column like '%\\%' escape '\';
returns 3 (all 3 rows with \ in them).
select count(*) from backslash_escape_test where
backslash_escape_test_column like '\\%' escape '\';
returns 1 (the \bar row).
select * from Values where ValueName like '%\\%'
values is a not so good name because it may be confused with the values keyword
Don't escape it. You just need wildcards around it like this:
select count(*)
from escape_test
where test_column like '%\%'
But, suppose you really do need to escape the slash. Here's a simpler, more straightforward answer:
The escape-expression allows you to specify whatever character for escaping that you wish. So why use a character that you're looking for, thus requiring you to escape it? Use any other character instead. I'll use a plus sign as an example, but it could be a backslash, pound-sign, question-mark, anything other than a character you are looking for or one of the wildcard characters (% or _).
select count(*)
from escape_test
where test_column like '%\%' escape '+';
Now you don't have to add anything into your like-pattern.
To hold myself to the same standard of proof that #Michael demonstrated --
create table escape_test
( test_column varchar(20) );
insert into escape_test
(test_column)
values ('foo\'),
('no slashes here'),
('foo\bar'),
('\bar');
select 'test1' trial, count(*) result
from escape_test
where test_column like '%\%'
UNION
select 'test2', count(*)
from escape_test
where test_column like '%\\%' escape '\'
UNION
select 'test3', count(*)
from escape_test
where test_column like '%\%' escape '+'
;
Which returns the same number of rows for each method:
TRIAL RESULT
----- ------
test1 3
test2 3
test3 3

A Query to return Alphanumeric

Does any have any query that returns apha numerice values only
Sample
Select FirstName,Surname,NationalID From Contacts
Results
FirstName|Surname|NationalID
Tony |Smith |934&#fdsaf$34£51
Mary |Jones |655^!ffdat#389£2
Expected results
FirstName|Surname|NationalID
Tony |Smith |934fdsaf34£51
Mary |Jones |655ffdat389£2
In other words i want the query to return numbers and text only :. A-Z and 0-9 only remving '$%^&*(~>
You could try the patindex function. For example with just selecting the FirstName this will remove the first occurrence of non alphanumeric:
SELECT replace(FirstName, substring(FirstName, patindex('%[^a-zA-Z0-9]%', FirstName), 1), '') FROM CONTACTS
To expand this to removing all occurrences, move the patindex call into a function as mentioned here:
CREATE FUNCTION CleanVarchar(#Temp VARCHAR(1000))
RETURNS VARCHAR(1000)
AS
BEGIN
WHILE PATINDEX('%[^a-zA-Z0-9]%', #Temp) > 0
SET #Temp = STUFF(#Temp, PATINDEX('%[^a-z^0-9]%', #Temp), 1, '')
RETURN #TEmp
END
Finally call the function
Select CleanVarchar(FirstName),CleanVarchar(Surname),CleanVarchar(NationalID) From Contacts
I don't think this is possible with T-SQL only in a neat way.
But you could easily expose such a functionality through a .NET Assembly to the SQL-Server.
There several examples on this topic over the net.
use Replace function of sql server
Sql Server Tips – Removing or Replacing non-alphanumeric characters in strings