T-SQL: CHARINDEX and REPLACE don't recognize a special character (Unicode 8207) - tsql

Given the following variable:
declare @str nvarchar(50) = 'a‏bc'
Between 'a' and 'b' there is a hidden character: nchar(8207).
Therefore:
select len(@str) --4
and:
select unicode(SUBSTRING(@str,2,1)) --8207
My problem is that I have many such records, and I have to find all of these characters and delete them.
I'm trying to find them with CHARINDEX or REPLACE, but neither recognizes this character:
select CHARINDEX(Nchar(unicode(8207)),@str) --0
select REPLACE (@str , Nchar(unicode(8207)), '1') --abc

It seems that REPLACE() does indeed not work here.
It looks like you will need to use STUFF():
DECLARE @Moo NVARCHAR(50) = CONCAT('a', NCHAR(8207), 'b', 'c')
SELECT @Moo
,LEN(@Moo)
,LEN(STUFF(@Moo, 2, 1, ''))
,STUFF(@Moo, 2, 1, '')
However, this requires you to know the positions of the offending unprintable characters. A WHILE loop or a tally table might serve you well here.
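One detail in the question's code is worth checking first: UNICODE(8207) converts the number to the string '8207' and returns the code point of its first character, '8' (56), so NCHAR(UNICODE(8207)) is the digit '8' rather than the hidden mark. The following is a minimal sketch, not a tested fix for the asker's data. It passes NCHAR(8207) directly and forces a binary collation in case the column's collation ignores this character in comparisons. The choice of Latin1_General_BIN2 is an assumption; any binary collation should behave similarly.

```sql
DECLARE @str nvarchar(50) = CONCAT(N'a', NCHAR(8207), N'bc');

SELECT UNICODE(8207) AS CodeOfDigit8,   -- code point of '8', not 8207
       -- A binary collation compares code points, so the mark cannot be ignored
       CHARINDEX(NCHAR(8207), @str COLLATE Latin1_General_BIN2) AS Position,
       LEN(REPLACE(@str COLLATE Latin1_General_BIN2, NCHAR(8207), N'')) AS CleanLength;
```

If this behaves as expected, the same REPLACE expression can be used in an UPDATE to strip the character from every row without knowing its position.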

Related

Why won't RTRIM clear spaces (for comparison)

I have a column that is a text field (I know the type is no longer used) that I need to compare. (The Instructions field is a text field.)
Case when rtrim(cast(RT.INSTRUCTIONS as varchar(max))) = rtrim(cast(HQ.INSTRUCTIONS as varchar(max))) then 'TRUE' Else 'FALSE' end as INSTRUCTIONS
The value in RT.Instructions is "Check the oil levels every 30 hours. "
The value in HQ.Instructions is "Check the oil levels every 30 hours."
Why won't the trailing blank go away? I ran LEN on both, and the HQ value is 1 less than the RT value.
I am also having the same issue on a varchar(60) field.
Perhaps there is a character that isn't being picked up. The following may be useful in finding that character, or at least get you started down the right path.
DECLARE @Char INT = 0
DECLARE @Tab TABLE (Id INT, Chr VARCHAR(5), Instructions VARCHAR(MAX), c VARCHAR(MAX))
-- Try every single-byte character code; CHAR() returns NULL above 255
WHILE @Char <= 255
BEGIN
INSERT INTO @Tab
SELECT Id
,CONVERT(NVARCHAR,CHAR(@Char)) Chr
,CONVERT(NVARCHAR,RIGHT(RTRIM(Instructions),1)) InstructionChar
,CONVERT(NVARCHAR,CHAR(CONVERT(int,@Char))) c
FROM YourTable
-- Keep rows whose last character is this code and is not an ordinary letter, digit, or closing punctuation mark
WHERE RIGHT(RTRIM(Instructions),1) LIKE '%'+CHAR(CONVERT(int,@Char))
AND RIGHT(RTRIM(Instructions),1) NOT LIKE '[A-Za-z]'
AND RIGHT(RTRIM(Instructions),1) NOT LIKE '[0-9]'
AND RIGHT(RTRIM(Instructions),1) NOT LIKE '.'
AND RIGHT(RTRIM(Instructions),1) NOT LIKE ']'
AND RIGHT(RTRIM(Instructions),1) NOT LIKE ')'
AND RIGHT(RTRIM(Instructions),1) NOT LIKE '"'
AND RIGHT(RTRIM(Instructions),1) NOT LIKE '}'
AND RIGHT(RTRIM(Instructions),1) NOT LIKE '/'
SET @Char = @Char + 1
END
SELECT DISTINCT *
FROM @Tab
Sorry, I'm not allowed to write comments yet.
First try:
Case when left(rtrim(cast(RT.INSTRUCTIONS as varchar(max))),len(cast(HQ.INSTRUCTIONS as varchar(max)))) = rtrim(cast(HQ.INSTRUCTIONS as varchar(max))) then 'TRUE' Else 'FALSE' end as INSTRUCTIONS
to check that no other issue is involved.
Then run:
SELECT ASCII(right(cast(RT.INSTRUCTIONS as varchar(max)),1))
to confirm that the trailing space is a "real" space: this query should display 32.
CHAR(32) => ' '
ASCII (' ') => 32
I bet you will get 160. 160 means the last character is a non-breaking space, which the trim functions do not remove.
If so, you will have to build a scalar function like:
CREATE FUNCTION [dbo].[fn_Replace_NonBreakingSpace]
(
@InputString varchar(max)
)
RETURNS varchar(MAX)
AS
BEGIN
-- Turn non-breaking spaces (160) into ordinary spaces (32) so RTRIM can remove them
RETURN REPLACE(@InputString, char(160), char(32))
END
And then:
Case when rtrim(dbo.fn_Replace_NonBreakingSpace(RT.INSTRUCTIONS)) = rtrim(dbo.fn_Replace_NonBreakingSpace(HQ.INSTRUCTIONS)) then 'TRUE' Else 'FALSE' end as INSTRUCTIONS
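The same idea can be tried out without creating a function first. This is a minimal sketch using two local variables in place of the RT and HQ columns (the variable names are made up for illustration). It assumes a single-byte code page such as 1252, where CHAR(160) is the non-breaking space.

```sql
DECLARE @rt varchar(60) = 'Check the oil levels every 30 hours.' + CHAR(160);
DECLARE @hq varchar(60) = 'Check the oil levels every 30 hours.';

SELECT ASCII(RIGHT(@rt, 1)) AS LastCharCode,  -- 160 under the code page assumption above
       CASE WHEN RTRIM(REPLACE(@rt, CHAR(160), CHAR(32)))
               = RTRIM(REPLACE(@hq, CHAR(160), CHAR(32)))
            THEN 'TRUE' ELSE 'FALSE' END AS INSTRUCTIONS;  -- TRUE once the character is replaced
```

Wrapping the REPLACE in a function, as above, mainly saves repeating the expression on both sides of every comparison.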

replace all alphanumeric characters tsql

I need to replace all alphanumeric characters in the input with 'x'.
'12 34 - a'
becomes 'xx xx - x'. I tried to use PATINDEX with [^a-zA-Z0-9], but after the first replacement the same alphanumeric character is still found. It looks like PATINDEX only works when removing characters.
Can someone advise a solution for this issue?
Try this:
DECLARE @t VARCHAR(max) = '12 34 - a'
DECLARE @Keep VARCHAR(50)
-- Match any letter or digit except 'x' itself; matching 'x' would make the loop never end
SET @Keep = '%[a-wyz0-9]%'
WHILE PATINDEX(@Keep, @t) > 0
SET @t = STUFF(@t, PATINDEX(@Keep, @t), 1, 'x')
SELECT @t
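On SQL Server 2017 or later, the loop can probably be avoided with TRANSLATE, which maps each character of one list to the character at the same position in another. This is a sketch under that version assumption, and the variable @chars is introduced only for illustration. Whether uppercase letters are also matched depends on the collation in effect.

```sql
DECLARE @t varchar(100) = '12 34 - a';
DECLARE @chars varchar(36) = 'abcdefghijklmnopqrstuvwxyz0123456789';

-- Both mapping arguments must have the same length (36 characters here)
SELECT TRANSLATE(@t, @chars, REPLICATE('x', LEN(@chars))) AS Masked;
-- Masked: xx xx - x
```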

Concatenating attributes in PostgreSQL

I am trying to create an aggregate function that concatenates numbers by grouping them. How can I go about it? Let's say I have a table like this below.
Table Numbers
123
145
187
105
I want the outcome to look like
105_123_145_187
I know how to use group_concat separator _ if I am working in MySQL.
How can I do it in PostgreSQL?
There is already such a function:
SELECT string_agg(num::text,'_')
FROM Numbers;
Details here: string_agg.
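The desired outcome in the question is sorted (105_123_145_187), and string_agg does not guarantee any order by itself. A minimal sketch, assuming PostgreSQL 9.0 or later, where an ORDER BY clause is allowed inside the aggregate call. The inline VALUES list stands in for the Numbers table.

```sql
SELECT string_agg(num::text, '_' ORDER BY num) AS joined
FROM (VALUES (123), (145), (187), (105)) AS numbers(num);
-- joined: 105_123_145_187
```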
Tell me if you use PostgreSQL 8.4 or an earlier version, and I will show you how to implement this function as a custom aggregate.
UPD Custom aggregate:
CREATE OR REPLACE FUNCTION public.concat_delimited (text, text, text)
RETURNS text AS
$body$
SELECT $1 || (CASE WHEN $1 = '' THEN '' ELSE $3 END) || $2;
$body$
LANGUAGE 'sql'
IMMUTABLE
RETURNS NULL ON NULL INPUT;
CREATE AGGREGATE public.text_concat (text, text)
(
SFUNC = public.concat_delimited,
STYPE = text
);
For modern PostgreSQL, use string_agg(columnname,'_').
For old versions, 8.4 and up, use array_to_string(array_agg(columnname), '_').
See the array functions and operators documentation.
Example:
regress=> SELECT array_to_string(array_agg(x::text), ', ') FROM generate_series(1,10) x;
array_to_string
-------------------------------
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
(1 row)
Always include your PostgreSQL version in your questions.
concat_ws(sep text, str "any" [, str "any" [, ...] ]) is the function you're looking for.
The first parameter is your separator; NULL arguments are ignored. See the PostgreSQL manual for details.
I am not versed in pgSQL at all, but the answer for writing an aggregate function is going to lie there; check out the pgSQL manual for how to write your functions.

Typecast string to integer

I am importing data from a table which has raw feeds in varchar, and I need to import a varchar column into an integer column. I tried using <column_name>::integer as well as to_number(<column_name>,'9999999'), but I am getting errors because there are a few empty fields. I need to retrieve them as empty or NULL in the new table.
Wild guess: if your value is an empty string, you can use NULLIF to replace it with NULL:
SELECT
NULLIF(your_value, '')::int
You can even go one step further and filter on this coalesced field, for example:
SELECT CAST(coalesce(<column>, '0') AS integer) as new_field
from <table>
where CAST(coalesce(<column>, '0') AS integer) >= 10;
If you need to treat empty columns as NULLs, try this:
SELECT CAST(nullif(<column>, '') AS integer);
On the other hand, if you do have NULL values that you need to avoid, try:
SELECT CAST(coalesce(<column>, '0') AS integer);
I do agree that the error message would help a lot.
The only way I managed to avoid errors caused by NULLs, special characters, or empty strings is this:
SELECT REGEXP_REPLACE(COALESCE(<column>::character varying, '0'), '[^0-9]*' ,'0')::integer FROM table
I'm not able to comment (too little reputation? I'm pretty new) on Lukas' post.
On my PG setup to_number(NULL) does not work, so my solution would be:
SELECT CASE WHEN <column> IS NULL THEN NULL ELSE <column>::integer END
FROM table
If the value may contain non-numeric characters, you can convert it to an integer as follows:
SELECT CASE WHEN <column>~E'^\\d+$' THEN CAST (<column> AS INTEGER) ELSE 0 END FROM table;
The CASE expression checks <column>: if it matches the integer pattern, the value is converted to an integer; otherwise 0 is returned.
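Since the question asks for empty fields to arrive as empty or NULL rather than 0, the same regular-expression guard can be sketched with the ELSE branch left out, so that anything that is not a plain run of digits becomes NULL. This is a minimal sketch. The inline VALUES list and the names t and val are made up for illustration, and it assumes standard_conforming_strings is on (the default in current versions) so the backslash needs no E'' prefix.

```sql
SELECT val,
       -- The cast is only evaluated for rows that pass the digits-only check
       CASE WHEN val ~ '^\d+$' THEN val::integer END AS as_int
FROM (VALUES ('42'), (''), (NULL), ('9gag')) AS t(val);
-- '42' gives 42; '', NULL and '9gag' all give NULL
```

A digit string too long for the integer type would still raise an out-of-range error, so a wider target type may be needed for such data.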
Common issue
Naively type casting any string into an integer like so
SELECT ''::integer
often results in the famous error:
Query failed: ERROR: invalid input syntax for integer: ""
Problem
PostgreSQL has no pre-defined function for safely type casting any string into an integer.
Solution
Create a user-defined function inspired by PHP's intval() function.
CREATE FUNCTION intval(character varying) RETURNS integer AS $$
SELECT
CASE
WHEN length(btrim(regexp_replace($1, '[^0-9]', '','g')))>0 THEN btrim(regexp_replace($1, '[^0-9]', '','g'))::integer
ELSE 0
END AS intval;
$$
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;
Usage
/* Example 1 */
SELECT intval('9000');
-- output: 9000
/* Example 2 */
SELECT intval('9gag');
-- output: 9
/* Example 3 */
SELECT intval('the quick brown fox jumps over the lazy dog');
-- output: 0
you can use this query
SUM(NULLIF(conversion_units, '')::numeric)
And if your column has decimal points
select NULLIF('105.0', '')::decimal
This works for me:
select ('0' || left(regexp_replace(coalesce(<column_name>, '0') || '', '[^0-9]', '', 'g'), 8))::integer
For easy view:
select (
'0' || -- ensure an empty string never reaches ::integer
left(
regexp_replace(
-- if null then '0', and convert to string for regexp
coalesce(<column_name>, '0') || '',
'[^0-9]',
'',
'g'
), -- remove everything except digits
8 -- ensure ::integer doesn't overflow
)
)::integer
The perfect solution for me is to use nullif and regexp_replace:
SELECT NULLIF(REGEXP_REPLACE('98123162t3712t37', '[^0-9]', '', 'g'), '')::bigint;
The above solution handles the following edge cases.
String and number: regexp_replace strips the non-digit characters so that the remaining digits can be cast.
SELECT NULLIF(REGEXP_REPLACE('string and 12345', '[^0-9]', '', 'g'), '')::bigint;
Only string: regexp_replace removes all the non-digit characters, which leaves an empty string that can't be cast directly to an integer, so use nullif to convert it to NULL.
SELECT NULLIF(REGEXP_REPLACE('only string', '[^0-9]', '', 'g'), '')::bigint;
Integer range: converting a string into an integer may cause an "out of range for type integer" error, so use bigint instead.
SELECT NULLIF(REGEXP_REPLACE('98123162t3712t37', '[^0-9]', '', 'g'), '')::bigint;

sql variables

Can somebody help me figure out why the SQL statement doesn't like the following line?
' and ' + @SearchCat + 'like '%'+@Keywords+'%''
It has to do with the number of single quotes, but I can't figure it out. How do the quotes work? What's the logic?
DECLARE @strStatement varchar(550)
declare @state as varchar(50)
declare @district as varchar(50)
declare @courttype as varchar(50)
declare @SearchCat as varchar(50)
declare @KeyWords as varchar (50)
select @State ='FL'
select @district = '11'
select @courtType = '1'
select @SearchCat='CaseNumber'
select @KeyWords='File'
select @strStatement= 'SELECT CaseNumber FROM app_Case
where State ='''+ @State+
''' and District='''+ @District+
' and ' + @SearchCat + 'like '%'+@Keywords+'%''
exec (@strStatement)
I was missing a space before 'like'
You've also got the wrong number of single-quotes around your ‘%’ characters, which will confuse it.
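To make the quoting logic concrete: inside a T-SQL string literal, two consecutive single quotes stand for one literal quote, so each value needs a quote emitted before and after it. The following is a sketch of the question's statement with the quotes balanced, including the closing quote that was missing after @District and the space before LIKE. It only illustrates the quoting and still concatenates values directly into the statement. It assumes SQL Server 2008 or later for the inline DECLARE initializers.

```sql
DECLARE @State varchar(50) = 'FL',
        @District varchar(50) = '11',
        @SearchCat varchar(50) = 'CaseNumber',
        @KeyWords varchar(50) = 'File',
        @strStatement varchar(550);

SET @strStatement =
      'SELECT CaseNumber FROM app_Case'
    + ' WHERE State = ''' + @State + ''''       -- '' inside a literal yields one '
    + ' AND District = ''' + @District + ''''
    + ' AND ' + @SearchCat + ' LIKE ''%' + @KeyWords + '%''';

PRINT @strStatement;
-- SELECT CaseNumber FROM app_Case WHERE State = 'FL' AND District = '11' AND CaseNumber LIKE '%File%'
```

Printing the statement before executing it is a quick way to check that the quotes come out balanced.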
Incidentally, you've made yourself a nice little SQL injection security hole there, from inside SQL itself! If one of the parameters contains an apostrophe, your @strStatement will break and any rogue SQL in the parameter value would be executed.
You can use the REPLACE function to double up single quotes to prevent this attack:
' AND '+QUOTENAME(@SearchCat)+' LIKE ''%'+REPLACE(@Keywords, '''', '''''')+'%''...'
(The QUOTENAME is needed if the column name contains out-of-band characters or is a reserved word.)
A cleaner (but quite verbose) approach to generating the SQL than tediously REPLACEing every string literal yourself is to use sp_executesql. For example:
-- sp_executesql expects nvarchar, so @strStatement must be declared as nvarchar rather than varchar
DECLARE @params nvarchar(200), @Number int;
SELECT @strStatement= N'
SELECT @Number= CaseNumber FROM app_Case
WHERE State=@State AND District=@District
AND '+QUOTENAME(@SearchCat)+N' LIKE ''%''+@Keywords+''%''
';
SELECT @params= N'@State varchar(50), @District varchar(50), @Keywords varchar(50), @Number int OUTPUT';
EXECUTE sp_executesql @strStatement, @params, @State, @District, @Keywords, @Number OUTPUT;
Incidentally, if @searchCat can only have a small number of different values, you can use a workaround to avoid having to do any of this laborious dynamic-SQL nonsense at all:
SELECT CaseNumber FROM app_Case
WHERE State=@State AND District=@District
AND CASE @searchCat
WHEN 'searchableColumnA' THEN searchableColumnA
WHEN 'searchableColumnB' THEN searchableColumnB
END LIKE '%'+@Keywords+'%';
See this rather good exploration of dynamically-created SQL statements in T-SQL for more background and some of the risks you face.
I figured it out. I was missing a space before 'like'.