In Transact-SQL I have:
DECLARE @phrase nvarchar(max) = 'KeyWord1 KeyWord2 ,KeyWord3 ' -- and maybe more, separated by space, comma or ; (but mainly by space, since it's a phrase)
I have a table Students
Students
(
StudentId bigint,
FullName nvarchar(50),
Article nvarchar(max)
)
I want to filter students by article, returning those whose article contains any word of @phrase.
Something like (pseudocode):
DECLARE @WordTable TABLE
(
Word nvarchar(50)
)
INSERT INTO @WordTable
SELECT Word of @phrase -- pseudocode: one row per word
SELECT *
FROM Students
WHERE Article LIKE (Word in @phrase) -- pseudocode: matches any word
I would split your string (comma delimited) into a temp table of words and join that to the Students table. From there you can make better use of the data than you could in string format.
There are plenty of ways of splitting a string into a table:
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=50648
Once you have your temp table you can use something like this.
SELECT S.*
FROM Students S WITH (NOLOCK)
JOIN #tmpArticles A
ON S.Article LIKE '%' + A.Article + '%' -- A.Article holds one word per row
A word of caution though: LIKE '%X%' performs poorly because the leading wildcard rules out an index seek, so question your approach if you have a LOT of string data.
This problem seems better suited to a Full-Text Search (FTS) approach:
http://msdn.microsoft.com/en-us/library/ms142571.aspx
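The split-then-join idea above can be tried end to end with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 instead of SQL Server (so string concatenation is `||` rather than `+`), the helper `split_phrase`, the `Words` temp table and the sample rows are made up, and only the Students columns come from the question.

```python
import re
import sqlite3

def split_phrase(phrase):
    """Split on runs of spaces, commas and semicolons, dropping empty pieces."""
    return [w for w in re.split(r"[ ,;]+", phrase) if w]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (StudentId INTEGER, FullName TEXT, Article TEXT)")
# Hypothetical sample data, not from the original question.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [
        (1, "Ann", "An essay that mentions KeyWord2 once"),
        (2, "Bob", "Nothing relevant here"),
        (3, "Cid", "KeyWord1 and KeyWord3 both appear"),
    ],
)

# One row per word, playing the role of the temp table in the answer.
words = split_phrase("KeyWord1 KeyWord2 ,KeyWord3 ")
conn.execute("CREATE TEMP TABLE Words (Word TEXT)")
conn.executemany("INSERT INTO Words VALUES (?)", [(w,) for w in words])

# DISTINCT because a student whose article contains several words
# would otherwise be returned once per matching word.
rows = conn.execute(
    """
    SELECT DISTINCT S.StudentId, S.FullName
    FROM Students S
    JOIN Words W ON S.Article LIKE '%' || W.Word || '%'
    ORDER BY S.StudentId
    """
).fetchall()
matched = [name for _, name in rows]
print(words)
print(matched)
```

Note the DISTINCT: without it the join returns a student once for every word found in the article.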
Using SQL Server 2016 and referring to this article:
https://www.sqlshack.com/dynamic-pivot-tables-in-sql-server/
That article uses this pivot:
SELECT * FROM (
SELECT
[Student],
[Subject],
[Marks]
FROM Grades
) StudentResults
PIVOT (
SUM([Marks])
FOR [Subject]
IN (
[Mathematics],
[Science],
[Geography]
)
) AS PivotTable
How can you change the query so that the Subjects ([Mathematics], [Science], [Geography]) don't have to be hardcoded in the query?
Can you rather get the Subject list using a subquery? How do you get the FOR to work with a query like this?
...
FOR [Subject]
IN (
SELECT subject FROM grades WHERE student = "Jacob"
)
How can you change the query so that the Subjects ([Mathematics], [Science], [Geography]) don't have to be hardcoded in the query?
You can't; you'll have to form the SQL as a string and execute it dynamically.
SQL makes it easy to have a variable number of columns (you just write more words in a SELECT), which also makes it easy to forget that columns are like properties of an object (and an entire row is like an instance of an object); they aren't something that varies dynamically every time you run a program. As a Person you don't have a Name this week and not next week.
The number of columns output from a query isn't meant to vary; the number of rows is. If you want a variable number of attributes, you'll have to form them as rows and then have your front end behave differently to account for them (i.e. don't do the pivot). If you can't do this because you have no front end, and you really do need a varying number of columns, you have to write a different SQL statement each time. You can do that by concatenating a new SQL string and EXECing it, but be under no illusions: it works because it's a totally different SQL statement, the programmatic equivalent of you editing your hardcoded query and re-running it.
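It looks something like this (not tested; treat it as a sketch):
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);
-- Build the bracketed column list, e.g. [Geography],[Mathematics],[Science].
-- STRING_AGG needs SQL Server 2017 or later, so on 2016 use FOR XML PATH instead.
SELECT @cols = STUFF((
SELECT ',' + QUOTENAME(Subject)
FROM (SELECT DISTINCT Subject FROM Grades) x
ORDER BY Subject
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '');
-- Splice the column list into the pivot query text.
SET @sql = N'
SELECT * FROM (
SELECT
[Student],
[Subject],
[Marks]
FROM Grades
) StudentResults
PIVOT (
SUM([Marks])
FOR [Subject]
IN (' + @cols + N')
) AS PivotTable';
EXEC sys.sp_executesql @sql;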
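The other option mentioned above, keeping the data as rows and letting the front end do the pivot, might look roughly like this. It is a sketch only: the `pivot` helper and the sample marks are invented for illustration, and the rows are assumed to arrive as (Student, Subject, Marks) tuples from a plain SELECT on Grades.

```python
from collections import defaultdict

def pivot(rows):
    """Turn (student, subject, marks) rows into one dict per student.

    Marks are summed per subject, and the set of subject columns is
    discovered from the data instead of being hardcoded.
    """
    subjects = sorted({subject for _, subject, _ in rows})
    totals = defaultdict(lambda: defaultdict(int))
    for student, subject, marks in rows:
        totals[student][subject] += marks
    table = [
        # .get() leaves None where a student has no marks for a subject,
        # mirroring the NULL that PIVOT would produce.
        {"Student": student, **{s: totals[student].get(s) for s in subjects}}
        for student in sorted(totals)
    ]
    return subjects, table

# Hypothetical result of: SELECT Student, Subject, Marks FROM Grades
rows = [
    ("Jacob", "Mathematics", 100),
    ("Jacob", "Science", 95),
    ("Amilee", "Science", 90),
    ("Amilee", "Geography", 80),
]
subjects, table = pivot(rows)
print(subjects)
for line in table:
    print(line)
```

The query itself never changes shape here; only the presentation layer decides how many columns to draw.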
I have a table with a column Country_City which contains combinations of countries and cities separated by a colon, for example Egypt: Cairo, and I want to split them into two different columns, Country and City.
I managed to do this with the SUBSTRING and CHARINDEX functions, but I am looking for another solution, if there is one.
Any opinions? Thanks in advance.
There are several approaches, but - to be honest - only one good choice: You should never ever store these values in one single column.
If you have to stick with this (legacy issue) or if you need this code in order to clean this bad structure, you may check one of these:
First a mockup table to simulate your issue:
DECLARE @tbl TABLE(ID INT IDENTITY, Country_Region NVARCHAR(1000));
INSERT INTO @tbl VALUES('Egypt: Cairo'),('Germany: Berlin');
--Fastest in most cases will be this (TRIM needs v2017; use LTRIM(RTRIM(...)) on older versions):
SELECT t.*
,TRIM(LEFT(t.Country_Region,A.PosColon-1)) AS Country
,TRIM(SUBSTRING(t.Country_Region,A.PosColon+1,1000)) AS Region
FROM @tbl t
CROSS APPLY(SELECT CHARINDEX(':',t.Country_Region) PosColon) A;
--Easy to read and good to use with more than two items per string (but rather slow)
SELECT t.*
,A.CastedToXml.value('/x[1]','nvarchar(max)') AS Country
,A.CastedToXml.value('/x[2]','nvarchar(max)') AS Region
FROM @tbl t
CROSS APPLY(SELECT CAST('<x>' + REPLACE(t.Country_Region,': ','</x><x>') + '</x>' AS XML) CastedToXml) A;
--Needs v2016, but is very fast, easy to read and easy to up-scale
SELECT t.*
,JSON_VALUE(A.AsJSON,'$[0]') AS Country
,JSON_VALUE(A.AsJSON,'$[1]') AS Region
FROM @tbl t
CROSS APPLY(SELECT CONCAT('["',REPLACE(t.Country_Region,': ','","'),'"]') AsJSON) A;
All of them produce the same output:
ID  Country_Region   Country  Region
1   Egypt: Cairo     Egypt    Cairo
2   Germany: Berlin  Germany  Berlin
I have a table that I need to delete random words/characters out of. To do this, I have been using a regexp_replace function with the addition of multiple patterns. An example is below:
select regexp_replace(combined,'\y(NAME|001|CONTAINERS:|MT|COUNT|PCE|KG|PACKAGE)\y','', 'g')
as description, id from export_final;
However, in the full list there are around 70 different patterns that I strip out of the description. As you can imagine, the code is very cluttered, which leads me to my question: is there a way to put these patterns into another table and then use that table to check the descriptions?
Of course. Populate your 'other' table with the patterns you need, then create a CTE that uses the string_agg function to build the regex. Example:
create table exclude_list( pattern_word text);
insert into exclude_list(pattern_word)
values('NAME'),('001'),('CONTAINERS:'),('MT'),('COUNT'),('PCE'),('KG'),('PACKAGE');
with exclude as
( select '\y(' || string_agg(pattern_word,'|') || ')\y' regex from exclude_list )
-- CTE simulates actual table to provide test data
, export_final (id,combined) as (values (0,'This row 001 NAME Main PACKAGE has COUNT 3 units'),(1,'But single package can hold 6 KG'))
select regexp_replace(combined,regex,'', 'g')
as description, id
from export_final cross join exclude;
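For comparison, the same build-the-regex-from-a-table idea can be sketched on the application side. The assumptions here: sqlite3 stands in for PostgreSQL, only the purely alphanumeric patterns from the example are loaded, `build_regex` and `clean` are made-up helper names, and Python's `\b` plays the role of PostgreSQL's `\y` word boundary. Each word is passed through `re.escape` so that a pattern containing regex metacharacters cannot break the combined expression.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exclude_list (pattern_word TEXT)")
conn.executemany(
    "INSERT INTO exclude_list (pattern_word) VALUES (?)",
    [("NAME",), ("001",), ("MT",), ("COUNT",), ("PCE",), ("KG",), ("PACKAGE",)],
)

def build_regex(conn):
    """Combine every row of exclude_list into one word-bounded alternation."""
    words = [w for (w,) in conn.execute(
        "SELECT pattern_word FROM exclude_list ORDER BY pattern_word")]
    return re.compile(r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b")

def clean(regex, text):
    """Remove the excluded words, then collapse the leftover whitespace."""
    return " ".join(regex.sub("", text).split())

regex = build_regex(conn)
first = clean(regex, "This row 001 NAME Main PACKAGE has COUNT 3 units")
second = clean(regex, "But single package can hold 6 KG")
print(first)
print(second)
```

As in the SQL version, matching is case-sensitive, so the lowercase "package" in the second row is left alone.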
Server: SQL Server 2008 R2
I apologize in advance, as I'm not sure of the best way to verbalize the question. I'm receiving a string of email addresses and I need to see if, within that string, any of the addresses exist as a user already. The query that obviously doesn't work is shown below, but hopefully it helps to clarify what I'm looking for:
SELECT f_emailaddress
FROM tb_users
WHERE f_emailaddress LIKE '%user1@domain.com,user2@domain.com%'
I was hoping SQL had an "InString" operator that would check for matches "within the string", but my Google abilities must be weak today.
Any assistance is greatly appreciated. If there simply isn't a way, I'll have to dig in and do some work in the codebehind to split each item in the string and search on each one.
Thanks in advance,
Beems
Split the input string and use an IN clause.
To split the CSV into rows, use this:
SELECT Ltrim(Rtrim(( Split.a.value('.', 'VARCHAR(100)') )))
FROM (SELECT Cast ('<M>'
+ Replace('user1@domain.com,user2@domain.com', ',', '</M><M>')
+ '</M>' AS XML) AS Data) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
Now use the above query in the WHERE clause:
SELECT f_emailaddress
FROM tb_users
WHERE f_emailaddress IN(SELECT Ltrim(Rtrim(( Split.a.value('.', 'VARCHAR(100)') )))
FROM (SELECT Cast ('<M>'
+ Replace('user1@domain.com,user2@domain.com', ',', '</M><M>')
+ '</M>' AS XML) AS Data) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a))
Or you can use an inner join:
SELECT U.f_emailaddress
FROM tb_users U
JOIN (SELECT Ltrim(Rtrim(( Split.a.value('.', 'VARCHAR(100)') ))) AS f_emailaddress -- alias needed for the ON clause
FROM (SELECT Cast ('<M>'
+ Replace('user1@domain.com,user2@domain.com', ',', '</M><M>')
+ '</M>' AS XML) AS Data) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)) B
ON U.f_emailaddress = B.f_emailaddress
You first need to split the CSV list into a temp table and then use that to INNER JOIN with your existing table, as that will act as a filter.
You cannot use CONTAINS unless you have created a Full Text index on that table and column, which I doubt is the case here.
For example:
CREATE TABLE #EmailAddresses (Email NVARCHAR(500) NOT NULL);
INSERT INTO #EmailAddresses (Email)
SELECT split.Val
FROM dbo.Splitter(@IncomingListOfEmailAddresses) split;
SELECT usr.f_emailaddress
FROM tb_users usr
INNER JOIN #EmailAddresses tmp
ON tmp.Email = usr.f_emailaddress;
Please note that the reference to "dbo.Splitter" is a placeholder for whatever string splitter you already have or might get. Please do not use any splitter that makes use of a WHILE loop. The best options are either the SQLCLR- or XML- based ones. The XML-based ones are generally fast but do have some issues with encoding if the string to be split has special XML characters such as &, <, or ". If you want a quick and easy SQLCLR-based splitter, you can download the Free version of the SQL# library (which I am the creator of, but this feature is in the free version) which contains String_Split and String_Split4k (for when the input is always <= 4000 characters).
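If the splitting ends up in the code-behind instead, as the question considers, the usual shape is to split the list in application code and pass each address as its own parameter. A minimal sketch, assuming Python with sqlite3 as a stand-in for the real data access layer; `existing_users` and the sample rows are hypothetical, and only tb_users and f_emailaddress come from the question.

```python
import sqlite3

def existing_users(conn, csv_addresses):
    """Return the addresses from a comma-separated list that already exist."""
    addresses = [a.strip() for a in csv_addresses.split(",") if a.strip()]
    if not addresses:
        return []
    # One placeholder per address keeps the values out of the SQL text.
    placeholders = ",".join("?" for _ in addresses)
    sql = (
        "SELECT f_emailaddress FROM tb_users "
        f"WHERE f_emailaddress IN ({placeholders}) "
        "ORDER BY f_emailaddress"
    )
    return [row[0] for row in conn.execute(sql, addresses)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_users (f_emailaddress TEXT)")
# Hypothetical existing users.
conn.executemany(
    "INSERT INTO tb_users VALUES (?)",
    [("user1@domain.com",), ("user3@domain.com",)],
)
found = existing_users(conn, "user1@domain.com, user2@domain.com")
print(found)
```

Only the placeholder list is built dynamically; the addresses themselves are bound as parameters, which avoids both quoting problems and SQL injection.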
SQL Server has a CONTAINS predicate and an IN operator. You can use either of those to accomplish your task; both are documented on MSDN. Hope this helps.
CONTAINS
CONTAINS looks for rows whose column contains the word or phrase you provide. It is roughly similar to LIKE '%myValue%', but it matches whole words and needs a full-text index on the column.
SELECT f_emailaddress
FROM tb_users
WHERE CONTAINS (f_emailaddress, '"user1@domain.com"');
IN
IN will return matches for any values in the provided comma delimited list. They need to be exact matches however. You can't provide partial terms.
SELECT f_emailaddress
FROM tb_users
WHERE f_emailaddress IN ('user1@domain.com','user2@domain.com')
As for splitting the values out into separate strings, there are existing Stack Overflow questions on splitting delimited strings that might point you in the right direction.
You can try something like this (not tested).
Before using this, make sure that you have created a full-text index on that table and column.
Replace each comma with OR (AND would require every address to appear in the same row), then:
SELECT id,email
FROM t
WHERE CONTAINS(email, '"user1@domain.com" OR "user2@domain.com"');
--prepare table variable for testing
DECLARE @tb_users AS TABLE
(f_emailaddress VARCHAR(100))
INSERT INTO @tb_users
( f_emailaddress)
VALUES ( 'user1@domain.com' ),
( 'user2@domain.com' ),
( 'user3@domain.com' ),
( 'user4@domain.com' )
--Your query, with the LIKE operands swapped
SELECT f_emailaddress
FROM @tb_users
WHERE 'user1@domain.com,user2@domain.com' LIKE '%' + f_emailaddress + '%'
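One caveat with the swapped LIKE is that it is a substring test, so a stored address that happens to be the tail of an incoming one also matches. The sketch below illustrates this and one possible guard, wrapping both sides in the delimiter. It is an assumption-laden illustration: sqlite3 is used in place of SQL Server (`||` instead of `+`), and the row r1@domain.com is invented to trigger the false positive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_users (f_emailaddress TEXT)")
conn.executemany(
    "INSERT INTO tb_users VALUES (?)",
    [("user1@domain.com",), ("user3@domain.com",), ("r1@domain.com",)],
)
incoming = "user1@domain.com,user2@domain.com"

# Plain swapped LIKE: any stored value that is a substring of the list matches.
loose = [r[0] for r in conn.execute(
    "SELECT f_emailaddress FROM tb_users "
    "WHERE ? LIKE '%' || f_emailaddress || '%' "
    "ORDER BY f_emailaddress",
    (incoming,),
)]

# Delimiter-wrapped variant: the stored value must fill a whole list item.
strict = [r[0] for r in conn.execute(
    "SELECT f_emailaddress FROM tb_users "
    "WHERE ',' || ? || ',' LIKE '%,' || f_emailaddress || ',%' "
    "ORDER BY f_emailaddress",
    (incoming,),
)]
print(loose)
print(strict)
```

Even the wrapped form treats `_` and `%` inside a stored address as wildcards, so splitting the list and comparing with `=` or IN remains the more robust choice.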
Here is the table structure:
table1
pk int, email character varying(100)[]
Data:
1, {'mr_a@gmail.com', 'mr_b@yahoo.com', 'mr_c@postgre.com'}
What I am trying to achieve is to find any record with an email containing 'gmail'.
Query:
select * from table1 where any(email) ilike '%gmail%';
But any() can only be on the right-hand side of the operator, and unnest() might slow down performance. Does anyone have an idea?
Edit:
I was actually a bit confused when I first posted. What I am trying to do is match through any(array[]).
This is my actual structure:
pk int,
code1 character varying(100),
code2 character varying(100),
code3 character varying(100), ...
My first approach is
select * from table1 where code1 ilike '%code%' or code2 ilike '%code%' or...
Then I tried
select * from table1 where any(array[code1, code2, ...]) ilike '%code%'
which does not work.
Create an operator that implements ILIKE "backwards", e.g.:
CREATE FUNCTION backward_texticlike(text, text) RETURNS boolean
STRICT IMMUTABLE LANGUAGE SQL
AS $$ SELECT texticlike($2, $1) $$;
CREATE OPERATOR !!!~~* (
PROCEDURE = backward_texticlike,
LEFTARG = text,
RIGHTARG = text,
COMMUTATOR = ~~*
);
(Note that ILIKE internally corresponds to the operator ~~*. Pick your own name for the reverse.)
Then you can run
SELECT * FROM table1 WHERE '%code%' !!!~~* ANY(ARRAY[code1, code2, ...]);
Store email addresses in a normalized table structure. Then you can avoid the expense of unnest, have "proper" database design, and take full advantage of indexing. If you're looking to do full-text style queries, you should be storing your email addresses in a table and then using a tsvector datatype so you can perform full-text queries AND use indexes. ILIKE '%whatever%' is going to result in a full table scan, since the planner can't use an ordinary index for a pattern with a leading wildcard. With your current design and a sufficient number of records, unnest will be the least of your worries.
Update: Even with the updates to the question, using a normalized codes table will cause you the least amount of headache and result in optimal scans. Any time you find yourself creating numbered columns, it's a good indication that you might want to normalize. That being said, you can create a computed text column to use as a search words column. In your case you could create a search_words column that is populated on insert and update by a trigger. Then you can create a tsvector on search_words to build full-text queries against it.