Prevent trailing spaces during insert? - postgresql

I have this INSERT statement and there seems to be trailing spaces at the end of the acct_desc fields. I'd like to know how to prevent trailing spaces from occurring during my insert statement.
INSERT INTO dwh.attribution_summary
SELECT d.adic,
d.ucic,
b.acct_type_desc as acct_desc,
a.begin_mo_balance as opening_balance,
c.date,
'fic' as userid
FROM fic.dim_members d
JOIN fic.fact_deposits a ON d.ucic = a.ucic
JOIN fic.dim_date c ON a.date_id = c.date_id
JOIN fic.dim_acct_type b ON a.acct_type_id = b.acct_type_id
WHERE c.date::timestamp = current_date - INTERVAL '1 days';

Use the PostgreSQL trim() function. There is trim(), rtrim() and ltrim().
To trim trailing spaces:
...
rtrim(b.acct_type_desc) as acct_desc,
...
If acct_type_desc is not of type text or varchar, cast it to text first:
...
rtrim(b.acct_type_desc::text) as acct_desc,
...
If acct_type_desc is of type char(n), casting it to text removes trailing spaces automatically, no trim() necessary.

Besides what others have said, add a CHECK CONSTRAINT to that column, so if one forgets to pass the rtrim() function inside the INSERT statement, the check constraint won't.
For example, check trailing spaces (in the end) of string:
ALTER TABLE dwh.attribution_summary
ADD CONSTRAINT tcc_attribution_summary_trim
CHECK (rtrim(acct_type_desc) = acct_type_desc);
Another example, check for leading and trailing spaces, and consecutive white spaces in string middle):
ALTER TABLE dwh.attribution_summary
ADD CONSTRAINT tcc_attribution_summary_whitespace
CHECK (btrim(regexp_replace(acct_type_desc, '\s+'::text, ' '::text, 'g'::text)) = acct_type_desc);

What is the type of acct_desc?
If it is CHAR(n), then the DBMS has no choice but to add spaces at the end; the SQL Standard requires that.
If it is VARCHAR(n), then the DBMS won't add spaces at the end.
If PostgresSQL supported them, the national variants of the types (NCHAR, NVARCHAR) would behave the same as the corresponding non-national variant does.

Related

How to select a column to appear with two single quote in the field

Here is my postgresql query
select 'insert into employee(ID_NUMBER,NAME,OFFICE) values ('''||ID_NUMBER||''','''||NAME||''','''||replace(DESIGNATION,'&','and')||''','''||replace(DEPT_NAME,'&','and')||''')' as col
from icare_employee_view
where id_number='201403241'
order by name;
output
insert into employee(ID_NUMBER,NAME,OFFICE) values ('201403241','ABINUMAN, JOSEPHINE CALLO','Assistant AGrS Principal for Curriculum and Instruction','AGrS Principal's Office')
but I need 'AGrS Principal's Office' to be 'AGrS Principal''s Office'
but I need 'AGrS Principal's Office' to be 'AGrS Principal''s Office'
any suggestions or sol'n is highly appreciated on how to fix my PostgreSQL query
Hi check this from pgDocs:
quote_literal ( text ) → text
Returns the given string suitably quoted to be used as a string
literal in an SQL statement string. Embedded single-quotes and
backslashes are properly doubled. Note that quote_literal returns null
on null input; if the argument might be null, quote_nullable is often
more suitable. See also Example 43.1.
quote_literal(E'O'Reilly') → 'O''Reilly'

In DB2 SQL, is it possible to set a variable in the SELECT statement to use multiple times..?

In DB2 SQL, is it possible to SET a variable with the contents of a returned field in the SELECT statement, to use multiple times for calculated fields and criteria further along in the same SELECT statement?
The purpose is to shrink and streamline the code, by doing a calculation once at the beginning and using it multiple times later on...including the HAVING, WHERE, and ORDER BY.
To be honest, I'm not sure this is possible in any version of SQL, much less DB2.
This is on an IBM iSeries 8202 with DB2 SQL v6, which unfortunately is not a candidate for upgrade at this time. This is a very old & messy database, which I have no control over. I must regularly include "cleanup functions" in my SQL.
To to clarify the question, note the following pseudocode. Actual working code follows further below.
DECLARE smnum INTEGER --Not sure if this is correct.
SELECT
-- This is where I'm not sure what to do.
SET CAST((CASE WHEN %smnum%='' THEN '0' ELSE %smnum% END) AS INTEGER) INTO smnum,
%smnum% AS sm,
invdat,
invno,
daqty,
dapric,
dacost,
(dapric-dacost)*daqty AS profit
FROM
saleshistory
WHERE
%smNum% = 30
ORDER BY
%smnum%
Below is my actual working SQL. When adjusted for 2017 or 2016, it can return >10K rows, depending on the salesperson. The complete table has >22M rows.
That buttload of CASE((CAST... function is what I wish to replace with a variable. This is not the only example of this. If I can make it work, I have many other queries that could benefit from the technique.
SELECT
CAST((CASE WHEN TRIM(DASM#)='' THEN '0' ELSE TRIM(DASM#) END) AS INTEGER) AS DASM,
DAIDAT,
DAINV# AS DAINV,
DALIN# AS DALIN,
CAST(TRIM(DAITEM) AS INTEGER) AS DAITEM,
TRIM(DABSW) AS DABSW,
TRIM(DAPCLS) AS DAPCLS,
DAQTY,
DAPRIC,
DAICOS,
DADPAL,
(DAPRIC-DAICOS+DADPAL)*DAQTY AS PROFIT
FROM
VIPDTAB.DAILYV
WHERE
CAST((CASE WHEN TRIM(DASM#)='' THEN '0' ELSE TRIM(DASM#) END) AS INTEGER)=30 AND
TRIM(DABSW)='B' AND
DAIDAT BETWEEN (YEAR(CURDATE())*10000) AND (((YEAR(CURDATE())+1)*10000)-1) AND
CAST(TRIM(DACOMP) AS INTEGER)=1
ORDER BY
CAST((CASE WHEN TRIM(DASM#)='' THEN '0' ELSE TRIM(DASM#) END) AS INTEGER),
DAIDAT,
DAINV#,
DALIN#
Just use a subquery or CTE. I can't figure out the actual logic you want, but the structure looks like this:
select . . .
from (select d.*,
(CASE . . . END) as calc_field
from VIPDTAB.DAILYV d
) d
No variable declaration is needed.
Here is what your SQL would look like with the sub-query that Gordon suggested:
SELECT
DASM,
DAIDAT,
DAINV# AS DAINV,
DALIN# AS DALIN,
CAST(DAITEM AS INTEGER) AS DAITEM,
TRIM(DABSW) AS DABSW,
TRIM(DAPCLS) AS DAPCLS,
DAQTY,
DAPRIC,
DAICOS,
DADPAL,
(DAPRIC-DAICOS+DADPAL)*DAQTY AS PROFIT
FROM
(SELECT
D.*,
CAST((CASE WHEN D.DASM#='' THEN '0' ELSE D.DASM# END) AS INTEGER) AS DASM
FROM VIPDTAB.DAILYV D
) D
WHERE
DASM=30 AND
TRIM(DABSW)='B' AND
DAIDAT BETWEEN (YEAR(CURDATE())*10000) AND (((YEAR(CURDATE())+1)*10000)-1) AND
CAST(DACOMP AS INTEGER)=1
ORDER BY
DASM,
DAIDAT,
DAINV#,
DALIN#
Notice that I removed a lot of the trim() functions, and you could likely remove the rest. The way IBM resolves the Varchar vs. Char comparison thing is by ignoring trailing blanks. So trim(anything) = '' is the same as anything = ''. And since cast(' 123 ' as integer) = 123, I have removed trims from within the cast functions as well. In addition trim(dabsw) = 'B' is the same as dabsw = 'B' as long as the 'B' is the first character in dabsw. So you could even remove that trim if all you are concerned with is trailing blanks.
Here are some additional notes based on comments. The above paragraph is not talking about auto-trim. Fixed length fields will always return as fixed length fields, the trailing blanks will remain. But in comparisons and expressions where trailing blanks are unimportant, or even a hindrance, they are ignored. In expressions where trailing blanks are important, like concatenation, the trailing blanks are not ignored. Another thing, trim() removes both leading and trailing blanks. If you are using trim() to read a fixed length character field into a Varchar, then rtrim() is likely the better choice as it only removes the trailing blanks.
Also, I didn't go through your fields to make sure I got everything you need, I just used * in the sub-query. For performance, it would be best to only return the fields you need. So if you replace D.* with an actual field list, you can remove the correlation name in the from clause of the sub-query. But, the sub-query itself still needs a correlation clause.
My verification was done using IBM i v7.1.
You can encapsalate the case statement in a view. I even have the fancy profit calc in there for you to order by profit. Now the biggest issue you have is the CCSID on the view for calculated columns but that's another question.
create or replace view VIPDTAB.DAILYVQ as
SELECT
CAST((CASE WHEN TRIM(DASM#)='' THEN '0' ELSE TRIM(DASM#) END) AS INTEGER) AS DASM,
DAIDAT,
DAINV# AS DAINV,
DALIN# AS DALIN,
CAST(TRIM(DAITEM) AS INTEGER) AS DAITEM,
TRIM(DABSW) AS DABSW,
TRIM(DAPCLS) AS DAPCLS,
DAQTY,
DAPRIC,
DAICOS,
DADPAL,
(DAPRIC-DAICOS+DADPAL)*DAQTY AS PROFIT
FROM
VIPDTAB.DAILYV
now you can
select dasm, count(*) from vipdtab.dailyvq where dasm = 0 group by dasm order by dasm
or
select * from vipdtab.dailyvq order by profit desc

Set numeric column to equal formatted varchar currency column in PostgreSQL

I have a VARCHAR(1000) column of prices with dollar signs (e.g. $100) and I have created a new NUMERIC(15,2) column, which I'd like to set equal to the prices in the VARCHAR column.
This is what worked for me in MySQL:
UPDATE product_table
SET cost = REPLACE(REPLACE(price, '$', ''), ',','');
but in PostgreSQL it throws an error:
ERROR: column "cost" is of type numeric but expression is of type character
LINE 2: SET cost = REPLACE(REPLACE(price, '$', ''), ',','');
^
HINT: You will need to rewrite or cast the expression.
I tried to follow the hint and tried some Google searches for examples, but my small brain hasn't been able to figure it out.
In PostgreSQL you can do this in one swoop, rather than replacing '$' and ',' s in separate calls:
UPDATE product_table
SET cost = regexp_replace(price, '[$,]', '', 'g')::numeric(15,2);
In regexp_replace the pattern [$,] means to replace either of '$' or ',' with the replace string (the empty string '' in this case), and the 'g' flag indicates that all such patterns need to be replaced.
Then you need to cast the resulting string to a numeric(15,2) value.
Simply cast the result of REPLACE with cast .. as numeric.
Try this:
UPDATE product_table
SET cost = CAST(REPLACE(REPLACE(price, '$', ''), ',','') AS NUMERIC);
I wouldn't suggest having this table structure though, because it can lead to anomalies (cost value doesn't reflect the price value).

PostgreSQL convert a string with commas into an integer

I want to convert a column of type "character varying" that has integers with commas to a regular integer column.
I want to support numbers from '1' to '10,000,000'.
I've tried to use: to_number(fieldname, '999G999G999'), but it only works if the format matches the exact length of the string.
Is there a way to do this that supports from '1' to '10,000,000'?
select replace(fieldname,',','')::numeric ;
To do it the way you originally attempted, which is not advised:
select to_number( fieldname,
regexp_replace( replace(fieldname,',','G') , '[0-9]' ,'9','g')
);
The inner replace changes commas to G. The outer replace changes numbers to 9. This does not factor in decimal or negative numbers.
You can just strip out the commas with the REPLACE() function:
CREATE TABLE Foo
(
Test NUMERIC
);
insert into Foo VALUES (REPLACE('1,234,567', ',', '')::numeric);
select * from Foo; -- Will show 1234567
You can replace the commas by an empty string as suggested, or you could use to_number with the FM prefix, so the query would look like this:
SELECT to_number(my_column, 'FM99G999G999')
There are things to take note:
When using function REPLACE("fieldName", ',', '') on a table, if there are VIEW using the TABLE, that function will not work properly. You must drop the view to use it.

Why does t-sql's LEN() function ignore the trailing spaces?

Description of LEN() function on MSDN : Returns the number of characters of the specified string expression, excluding trailing blanks.
Why was the LEN() function designed to work this way? What problem does this behaviour solve?
Related :
LEN function not including trailing spaces in SQL Server
charindex() counts whiteshars in the end, len() doesn't in T-SQL
It fixes the automatic padding of strings due to the data type length. Consider the following:
DECLARE #Test CHAR(10), #Test2 CHAR(10)
SET #Test = 'test'
SET #Test2 = 'Test2'
SELECT LEN(#Test), LEN(#Test + '_') - 1, LEN(#Test2), LEN(#Test2 + '_') - 1
This will return 4, 10, 5 and 10 respectively. Even though no trailing spaces were used for #Test, it still maintains its length of 10. If LEN did not trim the trailing spaces then LEN(#test) and LEN(#Test2) would be the same. More often than not people will want to know the length of the meaningful data, and not the length of the automatic padding so LEN removes trailing blanks. There are workarounds/alternatives where this is not the required behaviour.
space padding in non variable fiels like
CREATE TABLE tblSomething ( variable_length varchar(20), non_var_lenght char(10) );