Detect Column containing special characters other than space - Postgresql - postgresql

I need to find the values from a text column which have characters other than alphabets, numbers, and SPACE (It is a name column so having space is allowed).
I am trying this which is not working
select * from table where name ~ '[^a-z0-9 ]';
I have left a space between 9 and ]

The correct regular expression would be:
[^[:alnum:] ]
That will match any string that contains a character that is neither alphabetical nor numerical nor space.

Try ^[-a-z0-9 ]
I think you can use \\w instead of a-z0-9
so that looks like : [-\\w.]

Related

How to delete space in character text?

I wrote a code that automatically pulls time-related information from the system. As indicated in the table is fixed t247 Month names to 10 characters in length. But it is a bad image when showing on the report screen.
I print this way:
WRITE : 'Bugün', t_month_names-ltx, ' ayının'.
CONCATENATE gv_words-word '''nci günü' INTO date.
CONCATENATE date ',' INTO date.
CONCATENATE date gv_year INTO date SEPARATED BY space.
TRANSLATE date TO LOWER CASE.
I tried the CONDENSE t_month_names-ltx NO-GAPS. method to delete the spaces, but it was not enough.
After WRITE, I was able to write statically by setting the blank value:
WRITE : 'Bugün', t_month_names-ltx.
WRITE : 14 'ayının'.
CONCATENATE gv_words-word '''nci günü' INTO date.
CONCATENATE date ',' INTO date.
CONCATENATE date gv_year INTO date SEPARATED BY space.
TRANSLATE date TO LOWER CASE.
But this is not a correct use. How do I achieve this dynamically?
You could use a temporary field of type STRING:
DATA l_month TYPE STRING.
l_month = t_month_names-ltx.
WRITE : 'Bugün', l_month.
WRITE : 14 'ayının'.
CONCATENATE gv_words-word '''nci günü' INTO date.
CONCATENATE date ',' INTO date.
CONCATENATE date gv_year INTO date SEPARATED BY space.
TRANSLATE date TO LOWER CASE.
You can not delete trailing spaces from a TYPE C field, because it's of constant length. The unused length is always filled with spaces.
But after you assembled you string, you can use CONDENSE without NO-GAPS to remove any chains of more than one space within the string.
Add CONDENSE date. below the code you wrote and you should get the results you want.
Another option is to abandon CONCATENATE and use string templates (string literals within | symbols) for string assembly instead, which do not have the annoying habit of including trailing spaces of TYPE C fields:
DATA long_char TYPE C LENGTH 128.
long_char = 'long character field'.
WRITE |this is a { long_char } inserted without spaces|.
Output:
this is a long character field inserted without spaces

db2 remove all non-alphanumeric, including non-printable, and special characters

This may sound like a duplicate, but existing solutions does not work.
I need to remove all non-alphanumerics from a varchar field. I'm using the following but it doesn't work in all cases (it works with diamond questionmark characters):
select TRANSLATE(FIELDNAME, '?',
TRANSLATE(FIELDNAME , '', 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'))
from TABLENAME
What it's doing is the inner translate parse all non-alphanumeric characters, then the outer translate replace them all with a '?'. This seems to work for replacement character�. However, it throws The second, third or fourth argument of the TRANSLATE scalar function is incorrect. which is expected according to IBM:
The TRANSLATE scalar function does not allow replacement of a character by another character which is encoded using a different number of bytes. The second and third arguments of the TRANSLATE scalar function must end with correctly formed characters.
Is there anyway to get around this?
Edit: #Paul Vernon's solution seems to be working:
· 6005308 ??6005308
–6009908 ?6009908
–6011177 ?6011177
��6011183�� ??6011183??
Try regexp_replace(c,'[^\w\d]','') or regexp_replace(c,'[^a-zA-Z\d]','')
E.g.
select regexp_replace(c,'[^a-zA-Z\d]','') from table(values('AB_- C$£abc�$123£')) t(c)
which returns
1
---------
ABCabc123
BTW Note that the allowed regular expression patterns are listed on this page Regular expression control characters
Outside of a set, the following must be preceded with a backslash to be treated as a literal
* ? + [ ( ) { } ^ $ | \ . /
Inside a set, the follow must be preceded with a backslash to be treated as a literal
Characters that must be quoted to be treated as literals are [ ] \
Characters that might need to be quoted, depending on the context are - &

How to keep the upper case and lower case letters in a column alias in the results in Redshift

In Redshift we are trying to give more meaningful aliases to the columns we are returning from the queries as we are importing the results into TABLEAU, the issue is that RedShift turns all the letter to lower case ones, i.e. from "Event Date" it then returns "event date", any idea on how to work this one out to keep the alias given?
I know I'm a bit late to the party but for anyone else looking, you can enable case sensitivity, so if you want to return a column with camel casing for example
SET enable_case_sensitive_identifier TO true;
Then in your query wrap what you want to return the column as in double quotes
SELECT column AS "thisName"
Or as per OP's example
SELECT a.event_date AS "Event Date"
https://docs.aws.amazon.com/redshift/latest/dg/r_enable_case_sensitive_identifier.html
Edit: To have this behaviour as default for the cluster you will need to create/update a parameter group in Configurations => Workload Management. You can't change the settings for the default parameter group. Note, you will need to reboot the cluster after applying the parameter group for the changes to take effect.
No, you cannot do this in Redshift. all columns are lowercase only.
You can enforce upper case only by using
set describe_field_name_in_uppercase to on;
Also see the examples here https://docs.aws.amazon.com/redshift/latest/dg/r_names.html you can see that the upper case characters are returned as lower case. and it says "identifiers are case-insensitive and are folded to lowercase in the database"
You can of course rename the column to include uppercase within Tableau.
I was going through AWS docs for redshift and looks like INTCAP function can solve your use case
For reference => https://docs.aws.amazon.com/redshift/latest/dg/r_INITCAP.html
Brief description (copied)
The INITCAP function makes the first letter of each word in a string uppercase, and any subsequent letters are made (or left) lowercase. Therefore, it is important to understand which characters (other than space characters) function as word separators. A word separator character is any non-alphanumeric character, including punctuation marks, symbols, and control characters. All of the following characters are word separators:
! " # $ % & ' ( ) * + , - . / : ; < = > ? # [ \ ] ^ _ ` { | } ~
And in your case you have declared field name as event_date which will convert to Event_Date.
And next you can use REPLACE function to replace underscore '_'
For reference => https://docs.aws.amazon.com/redshift/latest/dg/r_REPLACE.html
You need to put
set describe_field_name_in_uppercase to on;
in your Tableau's Initial SQL.

Remove/replace special characters in column values?

I have a table column containing values which I would like to remove all the hyphens from. The values may contain more than one hyphen and vary in length.
Example: for all values I would like to replace 123 - ABCD - efghi with 123ABCDefghi.
What is the easiest way to remove all hyphens & update all column values in the table?
You can use the regexp_replace function to left only the digits and letters, like this:
update mytable
set myfield = regexp_replace(myfield, '[^\w]+','');
Which means that everything that is not a digit or a letter or an underline will be replaced by nothing (that includes -, space, dot, comma, etc).
If you want to also include the _ to be replaced (\w will leave it) you can change the regex to [^\w]+|_.
Or if you want to be strict with the characters that must be removed you use: [- ]+ in this case here a dash and a space.
Also as suggested by Luiz Signorelly you can use to replace all occurrences:
update mytable
set myfield = regexp_replace(myfield, '[^\w]+','','g');
You can use this.
update table
set column = format('%s%s', left(column, 3), right(column, -6));
Before:
After:

T-SQL trim not working - let spaces on the result

I have a trim function that apply ltrim and rtrim
CREATE FUNCTION dbo.TRIM(#string VARCHAR(MAX))
RETURNS VARCHAR(MAX)
BEGIN
RETURN LTRIM(RTRIM(#string))
END
GO
I do the following query:
SELECT distinct dbo.trim([subject]) as subject
FROM [DISTR]
The result has rows like:
"A"
"A "
"B"
...
I thought that thoose chars maybe weren't spaces but when I got the ascii code, it returns 32 which is the code for space.
My only guess is that I had to change the collaction of the database to: SQL_Latin1_General_CP1_CI_AI
Can that be the problem? Any ideas?
Thanks
Maybe your field contains more than spaces. Remember than " " could be a space, tab, and many other "blank" chars. It's possible to match it using ASCII or building a CLR implementation of trim that uses regular expressions