T-SQL find string with lowercase and uppercase

I have a database with several tables, and I need to search every varchar column across the database for columns that contain both lower- and upper-case characters.
To clarify:
If a column contains helLo, the query should return the name of that column; but if the column's values only ever contain hello or HELLO, then the name of the column should not be returned.

Let's exclude all UPPER and all LOWER, the rest will be MIXED.
SELECT someColumn
FROM someTable
WHERE someColumn <> UPPER(someColumn) AND someColumn <> LOWER(someColumn)
EDIT:
As suggested in the comments and described in detail here, I need to specify a case-sensitive collation (on both comparisons, otherwise the first comparison still uses the column's case-insensitive collation and filters everything out).
SELECT someColumn
FROM someTable
WHERE someColumn <> UPPER(someColumn) COLLATE SQL_Latin1_General_CP1_CS_AS
  AND someColumn <> LOWER(someColumn) COLLATE SQL_Latin1_General_CP1_CS_AS
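A quick way to sanity-check the predicate is to run it against a few literal values (the temp table below is made up purely for illustration):
-- hypothetical demo data; only the mixed-case row should come back
CREATE TABLE #CaseDemo (someColumn varchar(50));
INSERT INTO #CaseDemo (someColumn) VALUES ('hello'), ('HELLO'), ('helLo');

SELECT someColumn
FROM #CaseDemo
WHERE someColumn <> UPPER(someColumn) COLLATE SQL_Latin1_General_CP1_CS_AS
  AND someColumn <> LOWER(someColumn) COLLATE SQL_Latin1_General_CP1_CS_AS;
-- returns: helLo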

It sounds like you are after a case-sensitive search, so you'd need to use a case-sensitive collation for the WHERE clause.
e.g. if your collation is currently SQL_Latin1_General_CP1_CI_AS, which is case-insensitive, you can write a case-sensitive query using:
SELECT SomeColumn
FROM dbo.SomeTable
WHERE SomeColumn LIKE '%helLo%' COLLATE SQL_Latin1_General_CP1_CS_AS
Here, COLLATE SQL_Latin1_General_CP1_CS_AS tells it to use a case-sensitive collation to perform the filtering.
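If the goal is specifically to find values containing at least one lower-case and at least one upper-case letter, the same COLLATE trick works with LIKE character classes. A sketch (the letters are enumerated explicitly because an a-z range does not mean "lower-case only" under non-binary collations):
SELECT SomeColumn
FROM dbo.SomeTable
WHERE SomeColumn LIKE '%[abcdefghijklmnopqrstuvwxyz]%' COLLATE SQL_Latin1_General_CP1_CS_AS
  AND SomeColumn LIKE '%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%' COLLATE SQL_Latin1_General_CP1_CS_AS;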

I think I understand that you want to find any varchar column with mixed-case data in it?
If so, you can achieve this with a cursor looking at your column types, which then executes some dynamic SQL on the varchar columns it finds to check the data for mixed-case values.
I thoroughly recommend doing this on a non-production server using a copy of your database, not least because you need to create a table to deposit your findings into:
create table VarcharColumns (TableName nvarchar(max), ColumnName nvarchar(max))
declare @sql nvarchar(max)
declare my_cursor cursor local static read_only forward_only
for
select 'insert into VarcharColumns select t,c from(select ''' + s.name + '.' + tb.name + ''' t, ''' + c.name + ''' c from ' + s.name + '.' + tb.name + ' where ' + c.name + ' like ''%[abcdefghijklmnopqrstuvwxyz]%'' COLLATE SQL_Latin1_General_CP1_CS_AS and ' + c.name + ' like ''%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%'' COLLATE SQL_Latin1_General_CP1_CS_AS having count(1) > 0) a' as s
from sys.columns c
inner join sys.types t
    on (c.system_type_id = t.system_type_id
        and t.name = 'varchar')
inner join sys.tables tb
    on (c.object_id = tb.object_id)
inner join sys.schemas s
    on (tb.schema_id = s.schema_id)
open my_cursor
fetch next from my_cursor into @sql
while @@fetch_status = 0
begin
    print @sql
    exec(@sql)
    fetch next from my_cursor into @sql
end
close my_cursor
deallocate my_cursor
select * from VarcharColumns
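For a single hypothetical column (say dbo.SomeTable.SomeColumn), the statement that the loop prints and executes would look roughly like this:
insert into VarcharColumns
select t, c
from (select 'dbo.SomeTable' t, 'SomeColumn' c
      from dbo.SomeTable
      where SomeColumn like '%[abcdefghijklmnopqrstuvwxyz]%' COLLATE SQL_Latin1_General_CP1_CS_AS
        and SomeColumn like '%[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%' COLLATE SQL_Latin1_General_CP1_CS_AS
      having count(1) > 0) a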

You can compare the hash of the value with the hashes of its upper- and lower-case versions... here is a simple test:
declare @test varchar(256)
set @test = 'MIX' -- Try changing this to mixed case, and then to all lower case
select case
           when hashbytes('SHA1', @test) <> hashbytes('SHA1', upper(@test)) and hashbytes('SHA1', @test) <> hashbytes('SHA1', lower(@test))
           then 'MixedCase'
           else 'Not Mixed Case'
       end
So using this in a table... you can do something like this
create table #tempT (SomeColumn varchar(256))
insert into #tempT (SomeColumn) values ('some thing lower'),('SOME THING UPPER'),('Some Thing Mixed')
SELECT SomeColumn
FROM #tempT
WHERE 1 = case
when hashbytes('SHA1',SomeColumn) <> hashbytes('SHA1',upper(SomeColumn)) and hashbytes('SHA1',SomeColumn) <> hashbytes('SHA1',lower(SomeColumn)) then 1
else 0
end
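With the three sample rows above, only the mixed-case value should be returned:
SomeColumn
Some Thing Mixed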


Concatenate string instead of just replacing it

I have a table with standard columns where I want to perform regular INSERTs.
But one of the columns is of type varchar with special semantics. It's a string that's supposed to behave as a set of strings, where the elements of the set are separated by commas.
E.g. if one row has the value fish,sheep,dove in that varchar column and I insert the string ,fish,eagle, I want the result to be fish,sheep,dove,eagle (i.e. eagle gets added to the set, but fish doesn't because it's already in the set).
I have here this Postgres code that does the "set concatenation" that I want:
SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array('fish,sheep,dove' || ',fish,eagle', ','))) AS x;
But I can't figure out how to apply this logic to insertions.
What I want is something like:
CREATE TABLE IF NOT EXISTS t00(
userid int8 PRIMARY KEY,
a int8,
b varchar);
INSERT INTO t00 (userid,a,b) VALUES (0,1,'fish,sheep,dove');
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array(t00.b || EXCLUDED.b, ','))) AS x;
How can I achieve something like that?
Storing comma-separated values is a huge mistake to begin with. But if you really want to make your life harder than it needs to be, you might want to create a function that merges two comma-separated lists:
create function merge_lists(p_one text, p_two text)
returns text
as
$$
select string_agg(item, ',')
from (
select e.item
from unnest(string_to_array(p_one, ',')) as e(item)
where e.item <> '' --< necessary because of the leading , in your data
union
select t.item
from unnest(string_to_array(p_two, ',')) t(item)
where t.item <> ''
) t;
$$
language sql;
If you are using Postgres 14 or later, unnest(string_to_array(..., ',')) can be replaced with string_to_table(..., ',').
Then your INSERT statement gets a bit simpler:
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = merge_lists(excluded.b, t00.b);
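A quick standalone check of merge_lists (sample values only; the element order of the result is not guaranteed):
SELECT merge_lists('fish,sheep,dove', ',fish,eagle');
-- expected elements: fish, sheep, dove, eagle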
I think I was only missing parentheses around the SELECT statement:
INSERT INTO t00 (userid,a,b) VALUES (0,1,',fish,eagle')
ON CONFLICT (userid)
DO UPDATE SET
a = EXCLUDED.a,
b = (SELECT string_agg(unnest, ',') AS x FROM (SELECT DISTINCT unnest(string_to_array(t00.b || EXCLUDED.b, ','))) AS x);

Dynamic SQL with EXECUTE and nested format()

I tried doing this with just a normal CTE (no EXECUTE ... FORMAT) and I was able to do it.
I basically have a table that has some 5 columns, and I want to concatenate data from those columns to generate the contents of a new column.
I can do something like this and it works:
WITH cte AS (SELECT *, case
when var1 = '' then ''
when var2 = '' then ''
else '' end, 'adding_dummy_text_column'
FROM some_other_table sot
WHERE sot.type = 'java')
INSERT INTO my_new_table
SELECT *, 'This is the new column I want to make with some data from my CTE' || c.type
FROM cte c;
So this works as I said. I end up with a new table that has an extra column containing the hardcoded string concatenated with the type, e.g. This is the new column I want to make with some data from my CTE java
Of course, whatever is in the c.type column for the corresponding row of the CTE is what gets concatenated to that string.
The problem is, as soon as I start using EXECUTE ... FORMAT to make it cleaner and to get more power to concatenate/combine data pieces from my different columns (I have data scattered around in bad formats and I'm populating a fresh new table), it's as if the FORMAT arguments or the variables cannot see the CTE.
This is how I'm doing it
EXECUTE FORMAT ('WITH cte AS (SELECT *, case
when var1 = %L then %L
when var2 = '' '' then '' ''
else '' '' end, ''adding_dummy_text_column''
FROM some_other_table sot
WHERE sot.type = ''java'')
INSERT INTO my_new_table
SELECT *, ''This is the new column I want to make with some data from my CTE %I''
FROM cte c', 'word1', 'word2', c.type
);
OK, so I know I used the '' '' literals and the %L in this example, but I just wanted to show I had no issues with any of that. It's when I try to reference my CTE columns that things break: you can see I'm trying to do the same concatenation, but by leveraging EXECUTE ... FORMAT and the %I identifiers. The first two args are just fine; it's the c.type that doesn't work, no matter which column I try. I also removed the c alias and didn't get any better luck. It's 100% any time I reference the columns of the CTE, because if I remove that code it runs just fine.
But yeah, is there any workaround? I really want to transform this data, and right now I'm stuck doing the concatenation with ||.
This should do it:
EXECUTE format($f$WITH cte AS (SELECT *, CASE
WHEN var1 = %L THEN %L
WHEN var2 = ' ' THEN ' '
ELSE ' ' END, 'adding_dummy_text_column'
FROM some_other_table sot
WHERE sot.type = 'java')
INSERT INTO my_new_table
SELECT *, format('This is the new column I want to make with some data from my CTE %%I', c.type)
FROM cte c$f$
, 'word1', 'word2');
There are two levels. You need a second format() that's executed by the dynamic SQL string that's been concatenated by the first format().
I simplified with dollar-quoting. See:
Insert text with single quotes in PostgreSQL
The nested % character must be escaped by doubling it up: %%I. See:
What does %% in PL/pgSQL mean?
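A minimal sketch of the escaping rule, independent of the tables above: the outer format() collapses %% into a single %, leaving a plain placeholder for the inner format() that runs inside the dynamic SQL.
SELECT format('inner call: format(''tagged as %%I'', col)');
-- returns: inner call: format('tagged as %I', col)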
The EXECUTE above generates and executes this SQL statement:
WITH cte AS (SELECT *, CASE
WHEN var1 = 'word1' THEN 'word2'
WHEN var2 = ' ' THEN ' '
ELSE ' ' END, 'adding_dummy_text_column'
FROM some_other_table sot
WHERE sot.type = 'java')
INSERT INTO my_new_table
SELECT *, format('This is the new column I want to make with some data from my CTE %I', c.type)
FROM cte c
Which could be simplified and improved to this equivalent:
INSERT INTO my_new_table(col1, col2, ...) -- provide target column list!
SELECT *
, CASE WHEN var1 = 'word1' THEN 'word2'
WHEN var2 = ' ' THEN ' '
ELSE ' ' END
, 'adding_dummy_text_column'
, format('This is the new column I want to make with some data from my CTE %I', sot.type)
FROM some_other_table sot
WHERE sot.type = 'java'
About the missing target column list:
Cannot create stored procedure to insert data: type mismatch for serial column
You can create the new table and insert into it in one command!
create table my_new_table as
select *
, CASE WHEN var1 = 'hello' THEN 'word2'
WHEN var2 = ' ' THEN 'var2 is empty'
ELSE ' ' END adding_dummy_text_column
, format(
'This is the new column I want to make with some data from my CTE %I', sot.type)
from some_other_table sot
where sot.type ='java';

t-sql select column names from all tables where there is at least 1 null value

Context: I am exploring a new database (in MS SQL Server), and I want to know, for each table, all columns that have null values.
I.e. the result would look something like this:
table column nulls
Tbl1 Col1 8
I found this code here on Stack Overflow, which lists the table and column names; the WHERE clause is my addition.
I tried to filter for nulls in the WHERE clause, but the result ends up empty, and I see why: I am checking whether the column name is null, not its contents. But I can't figure out how to proceed.
select schema_name(tab.schema_id) as schema_name,
tab.name as table_name,
col.name as column_name
from sys.tables as tab
inner join sys.columns as col
on tab.object_id = col.object_id
left join sys.types as t
on col.user_type_id = t.user_type_id
-- in this where statement, I am trying to filter for nulls, but i get an empty result. and i know there are nulls
where col.name is null
order by schema_name, table_name, column_id
I also tried this (see 4th line):
select schema_name(tab.schema_id) as schema_name,
tab.name as table_name,
col.name as column_name
,(select count(*) from tab.name where col.name is null) as countnulls
from sys.tables as tab
inner join sys.columns as col
on tab.object_id = col.object_id
left join sys.types as t
on col.user_type_id = t.user_type_id
order by schema_name, table_name, column_id
The last one returns the error "Invalid object name 'tab.name'."
A column name can't be null, but if you mean nullable columns (columns that accept null values) that contain at least one null value, you can use the following statement:
declare @schema varchar(255), @table varchar(255), @col varchar(255), @cmd varchar(max)
DECLARE getinfo cursor for
SELECT schema_name(tab.schema_id) as schema_name, tab.name, col.name from sys.tables as tab
inner join sys.columns as col on tab.object_id = col.object_id
where col.is_nullable = 1
order by schema_name(tab.schema_id), tab.name, col.name
OPEN getinfo
FETCH NEXT FROM getinfo into @schema, @table, @col
WHILE @@FETCH_STATUS = 0
BEGIN
    set @schema = QUOTENAME(@schema)
    set @table = QUOTENAME(@table)
    set @col = QUOTENAME(@col)
    SELECT @cmd = 'IF EXISTS (SELECT 1 FROM '+ @schema +'.'+ @table +' WHERE ' + @col + ' IS NULL) BEGIN SELECT '''+@schema+''' as schemaName, '''+@table+''' as tablename, '''+@col+''' as columnName, * FROM '+ @schema +'.'+ @table +' WHERE ' + @col + ' IS NULL end'
    EXEC(@cmd)
    FETCH NEXT FROM getinfo into @schema, @table, @col
END
CLOSE getinfo
DEALLOCATE getinfo
This uses a cursor over all nullable columns in every table in the database, checks whether each column has at least one null value and, if so, selects the schema name, table name, column name and all records that have a null value in that column.
But if you only want the count of nulls, you can use the following statement:
declare @schema varchar(255), @table varchar(255), @col varchar(255), @cmd varchar(max)
DECLARE getinfo cursor for
SELECT schema_name(tab.schema_id) as schema_name, tab.name, col.name from sys.tables as tab
inner join sys.columns as col on tab.object_id = col.object_id
where col.is_nullable = 1
order by schema_name(tab.schema_id), tab.name, col.name
OPEN getinfo
FETCH NEXT FROM getinfo into @schema, @table, @col
WHILE @@FETCH_STATUS = 0
BEGIN
    set @schema = QUOTENAME(@schema)
    set @table = QUOTENAME(@table)
    set @col = QUOTENAME(@col)
    SELECT @cmd = 'IF EXISTS (SELECT 1 FROM '+ @schema +'.'+ @table +' WHERE ' + @col + ' IS NULL) BEGIN SELECT '''+@schema+''' as schemaName, '''+@table+''' as tablename, '''+@col+''' as columnName, count(*) as nulls FROM '+ @schema +'.'+ @table +' WHERE ' + @col + ' IS NULL end'
    EXEC(@cmd)
    FETCH NEXT FROM getinfo into @schema, @table, @col
END
CLOSE getinfo
DEALLOCATE getinfo
This uses a cursor over all nullable columns in every table in the database, checks whether each column has at least one null value and, if so, selects the schema name, table name, column name and the count of records that have a null value in that column.
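For a single hypothetical column (dbo.Tbl1.Col1, names made up), the string assembled into @cmd by this second loop would look roughly like this before EXEC runs it (the brackets inside the name literals come from QUOTENAME):
IF EXISTS (SELECT 1 FROM [dbo].[Tbl1] WHERE [Col1] IS NULL)
BEGIN
    SELECT '[dbo]' as schemaName, '[Tbl1]' as tablename, '[Col1]' as columnName, count(*) as nulls
    FROM [dbo].[Tbl1]
    WHERE [Col1] IS NULL
END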

Removing all the Alphabets from a string using a single SQL Query [duplicate]

I'm currently doing a data conversion project and need to strip all alphabetical characters from a string. Unfortunately I can't create or use a function as we don't own the source machine making the methods I've found from searching for previous posts unusable.
What would be the best way to do this in a select statement? Speed isn't too much of an issue, as this will only be running over 30,000 records or so, and it is a one-off statement.
You can do this in a single statement. You're not really creating a statement with 200+ REPLACEs are you?!
update tbl
set S = U.clean
from tbl
cross apply
(
select Substring(tbl.S,v.number,1)
-- this table will cater for strings up to length 2047
from master..spt_values v
where v.type='P' and v.number between 1 and len(tbl.S)
and Substring(tbl.S,v.number,1) like '[0-9]'
order by v.number
for xml path ('')
) U(clean)
Working SQL Fiddle showing this query with sample data
Replicated below for posterity:
create table tbl (ID int identity, S varchar(500))
insert tbl select 'asdlfj;390312hr9fasd9uhf012 3or h239ur ' + char(13) + 'asdfasf'
insert tbl select '123'
insert tbl select ''
insert tbl select null
insert tbl select '123 a 124'
Results
ID S
1 390312990123239
2 123
3 (null)
4 (null)
5 123124
A recursive CTE comes to help here: each recursion level uses STUFF and PATINDEX to strip the first non-digit character, so 'WB-H098' becomes 'B-H098', '-H098', 'H098' and finally '098', at which point PATINDEX('%[^0-9]%', ...) returns 0 and the row qualifies for the final SELECT.
;WITH CTE AS
(
SELECT
[ProductNumber] AS OrigProductNumber
,CAST([ProductNumber] AS VARCHAR(100)) AS [ProductNumber]
FROM [AdventureWorks].[Production].[Product]
UNION ALL
SELECT OrigProductNumber
,CAST(STUFF([ProductNumber], PATINDEX('%[^0-9]%', [ProductNumber]), 1, '') AS VARCHAR(100) ) AS [ProductNumber]
FROM CTE WHERE PATINDEX('%[^0-9]%', [ProductNumber]) > 0
)
SELECT * FROM CTE
WHERE PATINDEX('%[^0-9]%', [ProductNumber]) = 0
OPTION (MAXRECURSION 0)
output:
OrigProductNumber ProductNumber
WB-H098 098
VE-C304-S 304
VE-C304-M 304
VE-C304-L 304
TT-T092 092
Here is RichardTheKiwi's script wrapped in a function, for use in selects without CROSS APPLY.
I also added the dot, because in my case I use it for double and money values stored in a varchar field.
CREATE FUNCTION dbo.ReplaceNonNumericChars (@string VARCHAR(5000))
RETURNS VARCHAR(1000)
AS
BEGIN
    SET @string = REPLACE(@string, ',', '.')
    SET @string = (SELECT SUBSTRING(@string, v.number, 1)
                   FROM master..spt_values v
                   WHERE v.type = 'P'
                         AND v.number BETWEEN 1 AND LEN(@string)
                         AND (SUBSTRING(@string, v.number, 1) LIKE '[0-9]'
                              OR SUBSTRING(@string, v.number, 1) LIKE '[.]')
                   ORDER BY v.number
                   FOR
                   XML PATH('')
                  )
    RETURN @string
END
GO
Thanks RichardTheKiwi +1
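A quick check of the function (sample value only, chosen to mimic a money-style string):
SELECT dbo.ReplaceNonNumericChars('EUR 1,5')
-- expected: 1.5 (the comma is converted to a dot first, then only digits and dots are kept)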
Well if you really can't use a function, I suppose you could do something like this:
SELECT REPLACE(REPLACE(REPLACE(LOWER(col),'a',''),'b',''),'c','')
FROM dbo.table...
Obviously it would be a lot uglier than that, since I only handled the first three letters, but it should give the idea.

Conversion failed when converting the varchar value '1, 2, 3' to data type int

I can't seem to find a solution to this problem. Following is my query:
Declare @MY_STUDENT_ID Varchar(100)
Select @MY_STUDENT_ID = COALESCE(@MY_STUDENT_ID + ',', '') + Convert(varchar, STUDENT_ID) From Some_TABLE Where FISCAL_YEAR = '2014'
SELECT * FROM Table_Students WHERE STUDENT_ID IN (@MY_STUDENT_ID)
Basically the first query runs and gives me all student IDs as a comma-separated string, e.g. 1,2,3
Then this value is passed into the second query, but the second query gives the error I have posted in the title. No idea what to do, so any help will be appreciated.
The type of the STUDENT_ID field is int.
There is absolutely no need to mess about with comma delimited lists here.
Just use the sub query directly
SELECT *
FROM Table_Students
WHERE STUDENT_ID IN (SELECT STUDENT_ID
                     From Some_TABLE Where FISCAL_YEAR = '2014')
Your approach does not work because it ends up generating something with the semantics of
SELECT *
FROM Table_Students
WHERE STUDENT_ID IN ('1,2,3')
Which is not the same as
SELECT *
FROM Table_Students
WHERE STUDENT_ID IN (1,2,3)
It is just a single string parameter whose contents happen to resemble an IN list, rather than 3 int values.
You could do so using dynamic SQL, but in this scenario Martin Smith's answer seems to be better. Should you, however, wish to use dynamic SQL, this would be the way to do so (untested pseudo-code):
Declare @MY_STUDENT_ID varchar(100);
DECLARE @sql nvarchar(max);
Select @MY_STUDENT_ID = COALESCE(@MY_STUDENT_ID + ',', '') + Convert(varchar, STUDENT_ID) From Some_TABLE Where FISCAL_YEAR = '2014'
SELECT @sql = 'SELECT * FROM Table_Students WHERE STUDENT_ID IN (' + @MY_STUDENT_ID + ')';
EXEC sp_executesql @sql;