How to join two tables for phone codes - tsql

I have two tables.
The first table tbl1:
name nvarchar(255)
number nvarchar(255)
The second table dbo.phone_codes:
country nvarchar(255)
code nvarchar(4)
The first query:
Select Name, Number
from dbo.tbl1
I am getting this result:
User1 375xxxxxxxx
User1 7xxxxxxxxxx
User2 49xxxxxxxxx
The second query:
select country, code
from dbo.phone_codes
I am getting this result:
Belarus 375
Russian 7
Germany 49
Poland 48
What query should I use if I want to get this result:
User1 37552222222 Belarus 375
User1 77333333333 Russian 7
User2 49111111111 Germany 49

Try this:
SELECT t.Name, t.Number, p.country, p.code
FROM dbo.tbl1 t
INNER JOIN dbo.phone_codes p
    ON t.Number LIKE p.code + '%'

SELECT t.Name, t.Number, p.Country, p.Code
FROM dbo.tbl1 t, dbo.phone_codes p
WHERE CHARINDEX(p.Code, t.Number) = 1

Assuming your phone numbers and country codes consist only of digits, with no spaces, brackets, dashes, or plus signs, you can try something like this:
SELECT *
FROM (
    SELECT T.Name,
           T.Number,
           P.country,
           P.code,
           RANK() OVER (PARTITION BY T.Number
                        ORDER BY ISNULL(CAST(P.code AS int), 1) DESC) AS RNK
    FROM dbo.tbl1 T
    LEFT JOIN dbo.phone_codes P
        ON T.Number LIKE P.code + '%'
) A
WHERE A.RNK = 1;
If you have special characters, you need to use the REPLACE function to remove any non-numeric characters first.
The RANK function is used to resolve cases like Bermuda (1441) versus the US (1).
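For illustration only, here is a minimal sketch of that cleanup step (my own addition, reusing the table and column names from the question), stripping the separators inline before the comparison:
-- Strip spaces, dashes, plus signs and brackets from the number before matching.
SELECT T.Name, T.Number, P.country, P.code
FROM dbo.tbl1 T
LEFT JOIN dbo.phone_codes P
    ON REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
           T.Number, ' ', ''), '-', ''), '+', ''), '(', ''), ')', '')
       LIKE P.code + '%';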


How to create a comparison chart with feature count in PostgreSQL?

I have OpenStreetMap data loaded into a PostgreSQL table. An hstore column contains all of the tags. I would like to make a comparison chart to see how many records have name, name:en, and name:bg tags, for example. The result I would like to see is something like this:
I can achieve this manually using this query:
SELECT 1 AS id, '+' AS name, NULL AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name') UNION
SELECT 2 AS id, NULL AS name, '+' AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name:en') UNION
SELECT 3 AS id, NULL AS name, NULL AS "name:en", '+' AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name:bg') UNION
SELECT 4 AS id, '+' AS name, '+' AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name') AND exist(tags,'name:en') UNION
SELECT 5 AS id, '+' AS name, NULL AS "name:en", '+' AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name') AND exist(tags,'name:bg') UNION
SELECT 6 AS id, '+' AS name, '-' AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE exist(tags,'name') AND NOT exist(tags,'name:en') UNION
SELECT 7 AS id, '-' AS name, '+' AS "name:en", NULL AS "name:bg", count(*) FROM public.ways WHERE NOT exist(tags,'name') AND exist(tags,'name:en')
ORDER BY id
I consider this unnecessarily long and overcomplicated, plus I have to do it manually. I know there are some possibilities using the crosstab function, but I couldn't get it working. Based on the answer to this question I was able to create something like this:
SELECT * FROM crosstab(
'SELECT tags::text~''"name"=>".*"'' as a, tags::text~''"name:en"=>".*"'' as b, tags::text~''"name_int"=>".*"'' as c FROM public.ways')
AS ct (name boolean,"name:en" boolean, "name:bg" boolean)
GROUP BY name,"name:en","name:bg"
My problem is that I cannot seem to add a count column to this, and that it does not contain the options where only one of the three conditions is taken into account.
Any idea how could I solve this problem, or any direction where should I start?
Example data lines:
1 "name"=>"dm"
2 "name"=>"Ешекчи дере", "name:en"=>"Khatak Dere River"
3 "name:en"=>"Sushitsa"
4 "name"=>"Слънчева", "name:bg"=>"Слънчева", "name:en"=>"Slantcheva"
See if this works for you; it is possible to generate a join from an emulated table built from a SELECT to group the values:
SELECT row_number() OVER() AS id ,COUNT(*) AS count , COALESCE(a.tags , '')||COALESCE(b.tags,'')||COALESCE(c.tags ,'') AS tagcombination,
CASE WHEN COALESCE(a.tags, '')||COALESCE(b.tags, '')||COALESCE(c.tags, '') = 'name:en' THEN '+' END AS name
FROM public.ways AS a
LEFT JOIN (SELECT DISTINCT tags FROM public.ways WHERE tags = 'name' ) AS b ON a.tags = b.tags
LEFT JOIN (SELECT DISTINCT tags FROM public.ways WHERE tags IN('name:en', 'name:bg' ) ) AS c ON a.tags = c.tags
JOIN (SELECT generate_series )
GROUP BY tagcombination
--WHERE a.tags IS NOT NULL
--ORDER BY name
The name column could be translated into numbers derived from the tagcombination and even be ordered later if that fits your report better.
You also need to test this and add a predicate to filter if there are more possible values in the table than you want to count.
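As a hedged alternative (my own sketch, not part of the answer above, and assuming the hstore exist() function already used in the question), you can group directly on the boolean tag-presence flags to get one row per combination together with its count:
-- One row per combination of tag presence, with a count for each.
SELECT exist(tags, 'name')    AS has_name,
       exist(tags, 'name:en') AS "has_name:en",
       exist(tags, 'name:bg') AS "has_name:bg",
       count(*)               AS cnt
FROM public.ways
GROUP BY 1, 2, 3
ORDER BY 1 DESC, 2 DESC, 3 DESC;
Each output row then corresponds to one combination, for example true/false/false for records that only have the name tag.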

In psql how to run a Loop for a Select query with CTEs and get the output shown if I run it in a read-only db?

My initial question is posted here (In psql how to run a Loop for a Select query with CTEs and get the output shown in read-only db?), but it wasn't well defined, so I am creating a new question here.
I want to know how I can use a loop variable (or something similar) inside a SELECT query with CTEs.
I hope the following is a minimal reproducible example:
CREATE TABLE Persons (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
insert into persons values (4,'Smith','Eric','713 Louise Circle','Paris');
insert into persons values (5,'Smith2','Eric2','715 Louise Circle','London');
insert into persons values (8,'Smith3','Eric3','718 Louise Circle','Madrid');
Now I run the following for each of the values 1, 2, and 3 in place of <ROWNUMBER>:
WITH params AS (
    SELECT <ROWNUMBER> AS rownumber
),
person AS (
    SELECT personid, lastname, firstname, address
    FROM params, persons
    ORDER BY personid DESC
    LIMIT 1
    OFFSET (SELECT rownumber - 1 FROM params)
),
filtered AS (
    SELECT *
    FROM person
    WHERE address ~ (SELECT rownumber::text FROM params)
)
SELECT *
FROM filtered;
and I get these outputs for 1, 2, and 3 respectively:
| personid | lastname | firstname | address
|----------|----------|-----------|-------------------
| 8 | Smith3 | Eric3 | 718 Louise Circle
(1 row)
| personid | lastname | firstname | address
|----------|----------|-----------|---------
(0 rows)
| personid | lastname | firstname | address
|----------|----------|-----------|-------------------
| 4 | Smith | Eric | 713 Louise Circle
(1 row)
My goal is to have a single query, with a loop or any other means, that gets the union of all 3 of the above select runs. I only have read-only access to the db, so I can't write the output to a new table. The GUI software I use has options to show the output in an internal window or export it to a plain text file. The desired result would be:
|personid | lastname | firstname | address
|----------|----------|-----------|-------------------
| 4 | Smith | Eric | 713 Louise Circle
| 8 | Smith3 | Eric3 | 718 Louise Circle
(2 rows)
In reality, the loop variable is used in a more complicated way.
If I decipher this right, you basically want to select all people where the row number according to the descending ID appears in the address. The final result should then be limited to certain of these row numbers.
Then you don't need to use that cumbersome LIMIT/OFFSET construct at all. You can simply use the row_number() window function.
To filter for the row numbers you can simply use IN. Depending on what you want, you can use a list of literals (especially if the numbers aren't consecutive), generate_series() to generate a list of consecutive numbers, or a subquery when the numbers are stored in another table.
With a list of literals that would look something like this:
SELECT pn.personid,
pn.lastname,
pn.firstname,
pn.address,
pn.city
FROM (SELECT p.personid,
p.lastname,
p.firstname,
p.address,
p.city,
row_number() OVER (ORDER BY p.personid DESC) n
FROM persons p) pn
WHERE pn.address LIKE concat('%', pn.n, '%')
AND pn.n IN (1, 2, 4);
If you want to use generate_series() an example would be:
SELECT pn.personid,
pn.lastname,
pn.firstname,
pn.address,
pn.city
FROM (SELECT p.personid,
p.lastname,
p.firstname,
p.address,
p.city,
row_number() OVER (ORDER BY p.personid DESC) n
FROM persons p) pn
WHERE pn.address LIKE concat('%', pn.n, '%')
AND pn.n IN (SELECT s.n
FROM generate_series(1, 3) s (n));
And a subquery of another table could be used like so:
SELECT pn.personid,
pn.lastname,
pn.firstname,
pn.address,
pn.city
FROM (SELECT p.personid,
p.lastname,
p.firstname,
p.address,
p.city,
row_number() OVER (ORDER BY p.personid DESC) n
FROM persons p) pn
WHERE pn.address LIKE concat('%', pn.n, '%')
AND pn.n IN (SELECT t.nmuloc
FROM elbat t);
For larger sets of numbers you can also consider using an INNER JOIN on the numbers instead of IN.
Using generate_series():
SELECT pn.personid,
pn.lastname,
pn.firstname,
pn.address,
pn.city
FROM (SELECT p.personid,
p.lastname,
p.firstname,
p.address,
p.city,
row_number() OVER (ORDER BY p.personid DESC) n
FROM persons p) pn
INNER JOIN generate_series(1, 1000000) s (n)
ON s.n = pn.n
WHERE pn.address LIKE concat('%', pn.n, '%');
Or when the numbers are in another table:
SELECT pn.personid,
pn.lastname,
pn.firstname,
pn.address,
pn.city
FROM (SELECT p.personid,
p.lastname,
p.firstname,
p.address,
p.city,
row_number() OVER (ORDER BY p.personid DESC) n
FROM persons p) pn
INNER JOIN elbat t
ON t.nmuloc = pn.n
WHERE pn.address LIKE concat('%', pn.n, '%');
Note that I also changed the regular expression pattern matching to a simple LIKE. That would make the queries a bit more portable. But you can of course replace that by any expression you really need.
db<>fiddle (with some of the variants)
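Separately, if the loop variable is used in more places than just the row-number filter (as the question hints), a hedged sketch of another option (mine, not part of the answer above) is to keep the original LIMIT/OFFSET logic and run it once per parameter value with a LATERAL join:
-- Run the per-parameter logic once for each generated rownumber.
SELECT p.personid, p.lastname, p.firstname, p.address
FROM generate_series(1, 3) AS params(rownumber)
CROSS JOIN LATERAL (
    SELECT personid, lastname, firstname, address
    FROM persons
    ORDER BY personid DESC
    LIMIT 1
    OFFSET params.rownumber - 1
) AS p
WHERE p.address ~ params.rownumber::text
ORDER BY p.personid;
This reproduces the union of the three runs from the question in a single statement without writing to a table, which fits the read-only constraint.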

Split a string in characters SQL

How can I split a string into individual characters and add a new line after each character in PostgreSQL?
For example, given:
num desc
1   Hello
2   Bye
the desired output is:
num desc
1   H
    e
    l
    l
    o
2   B
    y
    e
select num, regexp_split_to_table(descr,'')
from the_table
order by num;
SQLFiddle: http://sqlfiddle.com/#!15/13c00/4
The order of the characters is, however, not guaranteed, and achieving that is a bit complicated.
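A hedged sketch (my addition, not part of the answer above) of one way to pin the order down on PostgreSQL 9.4 or later is WITH ORDINALITY, which numbers the characters as they are produced:
-- pos gives the position of each character within its string
SELECT t.num, c.ch
FROM the_table t
CROSS JOIN LATERAL regexp_split_to_table(t.descr, '')
     WITH ORDINALITY AS c(ch, pos)
ORDER BY t.num, c.pos;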
Building on Erwin's answer regarding this problem:
select case
when row_number() over (partition by id order by rn) = 1 then id
else null
end as id_display,
ch_arr[rn]
from (
select *,
generate_subscripts(ch_arr, 1) AS rn
from (
select id,
regexp_split_to_array(descr,'') as ch_arr
from data
) t1
) t2
order by id, rn;
Edit:
If you just want a single string for each id, where the characters are separated by a newline, you can use this:
select id,
array_to_string(regexp_split_to_array(descr,''), chr(10))
from data
order by id

TSQL invalid HAVING count

I am using SSMS 2008 and trying to use a HAVING statement. This should be a really simple query. However, I am only getting one record returned even though there are numerous duplicates.
Am I doing something wrong with the HAVING statement here? Or is there some other function that I could use instead?
select
address_desc,
people_id
from
dbo.address_view
where people_id is not NULL
group by people_id , address_desc
having count(*) > 1
sample data from address_view:
address_desc                          people_id
------------------------------------  ------------------------------------
Murfreesboro, TN 37130 F15D1135-9947-4F66-B778-00E43EC44B9E
11 Mohawk Rd., Burlington, MA 01803 C561918F-C2E9-4507-BD7C-00FB688D2D6E
Unknown, UN 00000 C561918F-C2E9-4507-BD7C-00FB688D2D6E
Jacksonville, NC 28546 FC7C78CD-8AEA-4C8E-B93D-010BF8E4176D
Memphis, TN 38133 8ED8C601-5D35-4EB7-9217-012905D6E9F1
44 Maverick St., Fitchburg, MA 8ED8C601-5D35-4EB7-9217-012905D6E9F1
The GROUP BY is going to lump your duplicates together into a single row.
I think instead, you want to find all people_id values with duplicate address_desc:
SELECT a.address_desc, a.people_id
FROM dbo.address_view a
INNER JOIN (SELECT address_desc
FROM dbo.address_view
GROUP BY address_desc
HAVING COUNT(*) > 1) t
ON a.address_desc = t.address_desc
Using ROW_NUMBER with PARTITION BY, you can find the duplicate occurrences where row_num > 1:
select address_desc,
people_id,
row_num
from
(
select
address_desc,
people_id,
row_number() over (partition by address_desc order by address_desc) row_num
from
dbo.address_view
where people_id is not NULL
) x
where row_num>1
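If the duplicates you care about are on people_id alone (which is what the sample data suggests: the same GUID appears with several addresses), a hedged variant (my own sketch, not part of the answers above) using a windowed COUNT keeps every row of each duplicate set, including the first occurrence:
-- dup_count is the number of rows sharing the same people_id
SELECT address_desc, people_id
FROM (
    SELECT address_desc,
           people_id,
           COUNT(*) OVER (PARTITION BY people_id) AS dup_count
    FROM dbo.address_view
    WHERE people_id IS NOT NULL
) x
WHERE dup_count > 1;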

TSQL Group By with an "OR"?

This query for creating a list of Candidate duplicates is easy enough:
SELECT Count(*), Can_FName, Can_HPhone, Can_EMail
FROM Can
GROUP BY Can_FName, Can_HPhone, Can_EMail
HAVING Count(*) > 1
But if the actual rule I want to check against is FName and (HPhone OR Email) - how can I adjust the GROUP BY to work with this?
I'm fairly certain I'm going to end up with a UNION SELECT here (i.e. do FName, HPhone on one and FName, EMail on the other and combine the results) - but I'd love to know if anyone knows an easier way to do it.
Thank you in advance for any help.
Scott in Maine
Before I can advise anything, I need to know the answer to this question:
name phone email
John 555-00-00 john#example.com
John 555-00-01 john#example.com
John 555-00-01 john-other#example.com
What COUNT(*) do you want for this data?
Update:
If you just want to know that a record has any duplicates, use this:
WITH q AS (
SELECT 1 AS id, 'John' AS name, '555-00-00' AS phone, 'john#example.com' AS email
UNION ALL
SELECT 2 AS id, 'John', '555-00-01', 'john#example.com'
UNION ALL
SELECT 3 AS id, 'John', '555-00-01', 'john-other#example.com'
UNION ALL
SELECT 4 AS id, 'James', '555-00-00', 'james#example.com'
UNION ALL
SELECT 5 AS id, 'James', '555-00-01', 'james-other#example.com'
)
SELECT *
FROM q qo
WHERE EXISTS
(
SELECT NULL
FROM q qi
WHERE qi.id <> qo.id
AND qi.name = qo.name
AND (qi.phone = qo.phone OR qi.email = qo.email)
)
It's more efficient, but doesn't tell you where the duplicate chain started.
This query selects all entries along with a special field, chainid, that indicates where the duplicate chain started.
WITH q AS (
SELECT 1 AS id, 'John' AS name, '555-00-00' AS phone, 'john#example.com' AS email
UNION ALL
SELECT 2 AS id, 'John', '555-00-01', 'john#example.com'
UNION ALL
SELECT 3 AS id, 'John', '555-00-01', 'john-other#example.com'
UNION ALL
SELECT 4 AS id, 'James', '555-00-00', 'james#example.com'
UNION ALL
SELECT 5 AS id, 'James', '555-00-01', 'james-other#example.com'
),
dup AS (
SELECT id AS chainid, id, name, phone, email, 1 as d
FROM q
UNION ALL
SELECT chainid, qo.id, qo.name, qo.phone, qo.email, d + 1
FROM dup
JOIN q qo
ON qo.name = dup.name
AND (qo.phone = dup.phone OR qo.email = dup.email)
AND qo.id > dup.id
),
chains AS
(
SELECT *
FROM dup do
WHERE chainid NOT IN
(
SELECT id
FROM dup di
WHERE di.chainid < do.chainid
)
)
SELECT *
FROM chains
ORDER BY
chainid
None of these answers is correct. Quassnoi's is a decent approach, but you will notice one fatal flaw in the expressions "qo.id > dup.id" and "di.chainid < do.chainid": comparisons made by ID! This is ALWAYS bad practice because it depends on some inherent ordering in the IDs. IDs should NEVER be given any implicit meaning and should ONLY participate in equality or null testing. You can easily break Quassnoi's solution in this example by simply reordering the IDs in the data.
The essential problem is a disjunctive condition with a grouping, which leads to the possibility of two records being related through an intermediate, though they are not directly relatable.
e.g., you stated these records should all be grouped:
(1) John 555-00-00 john#example.com
(2) John 555-00-01 john#example.com
(3) John 555-00-01 john-other#example.com
You can see that #1 and #2 are relatable, as are #2 and #3, but clearly #1 and #3 are not directly relatable as a group.
This establishes that a recursive or iterative solution is the ONLY possible solution.
So, recursion is not viable since you can easily end up in a looping situation. This is what Quassnoi was trying to avoid with his ID comparisons, but in doing so he broke the algorithm. You could try limiting the levels of recursion, but you may not then complete all relations, and you will still potentially be following loops back upon yourself, leading to excessive data size and prohibitive inefficiency.
The best solution is ITERATIVE: Start a result set by tagging each ID as a unique group ID, and then spin through the result set and update it, combining IDs into the same unique group ID as they match on the disjunctive condition. Repeat the process on the updated set each time until no further updates can be made.
I will create example code for this soon.
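A hedged T-SQL sketch of that iterative approach (my own illustration of the description above, not the poster's promised code; it uses the question's column names and assumes Can has an integer ID column):
-- Working copy: every row starts in its own group.
SELECT ID, Can_FName, Can_HPhone, Can_EMail, ID AS GroupID
INTO #grp
FROM Can;

WHILE 1 = 1
BEGIN
    -- Merge groups: whenever two rows satisfy the disjunctive rule,
    -- pull the larger GroupID down to the smaller one.
    UPDATE g1
    SET GroupID = g2.GroupID
    FROM #grp g1
    JOIN #grp g2
      ON g1.Can_FName = g2.Can_FName
     AND (g1.Can_HPhone = g2.Can_HPhone OR g1.Can_EMail = g2.Can_EMail)
     AND g2.GroupID < g1.GroupID;

    IF @@ROWCOUNT = 0 BREAK;   -- stable: no more merges possible
END;

-- Groups with more than one member are the duplicate sets.
SELECT GroupID, COUNT(*) AS members
FROM #grp
GROUP BY GroupID
HAVING COUNT(*) > 1;
Because the loop runs until the update touches no rows, the result does not depend on how the IDs happen to be ordered.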
GROUP BY doesn't support OR - it's implicitly AND and must include every non-aggregator in the select list.
I assume you also have a unique ID integer as the primary key on this table. If you don't, it's a good idea to have one, for this purpose and many others.
Find those duplicates by a self-join:
select
c1.ID
, c1.Can_FName
, c1.Can_HPhone
, c1.Can_Email
, c2.ID
, c2.Can_FName
, c2.Can_HPhone
, c2.Can_Email
from
(
select
min(ID) AS ID,
Can_FName,
Can_HPhone,
Can_Email
from Can
group by
Can_FName,
Can_HPhone,
Can_Email
) c1
inner join Can c2 on c1.ID < c2.ID
where
c1.Can_FName = c2.Can_FName
and (c1.Can_HPhone = c2.Can_HPhone OR c1.Can_Email = c2.Can_Email)
order by
c1.ID
The query gives you N-1 rows for each set of N duplicates - if you want just a count along with each unique combination, count the rows grouped by the "left" side:
select count(1) + 1
, c1.Can_FName
, c1.Can_HPhone
, c1.Can_Email
from
(
select
min(ID) AS ID,
Can_FName,
Can_HPhone,
Can_Email
from Can
group by
Can_FName,
Can_HPhone,
Can_Email
) c1
inner join Can c2 on c1.ID < c2.ID
where
c1.Can_FName = c2.Can_FName
and (c1.Can_HPhone = c2.Can_HPhone OR c1.Can_Email = c2.Can_Email)
group by
c1.Can_FName
, c1.Can_HPhone
, c1.Can_Email
Granted, this is more involved than a union - but I think it illustrates a good way of thinking about duplicates.
Project the desired transformation first from a derived table, then do the aggregation:
SELECT COUNT(*)
, CAN_FName
, Can_HPhoneOrEMail
FROM (
SELECT Can_FName
, ISNULL(Can_HPhone,'') + ISNULL(Can_EMail,'') AS Can_HPhoneOrEMail
FROM Can) AS Can_Transformed
GROUP BY Can_FName, Can_HPhoneOrEMail
HAVING Count(*) > 1
Adjust your 'OR' operation as needed in the derived table project list.
I know this answer will be criticised for the use of the temp table, but it will work anyway:
-- create temp table to give the table a unique key
create table #tmp(
ID int identity,
can_Fname varchar(200) null, -- real type and len here
can_HPhone varchar(200) null, -- real type and len here
can_Email varchar(200) null -- real type and len here
)
-- just copy the rows where a duplicate fname exists
-- (better performance, especially for a big table)
insert into #tmp (can_Fname, can_HPhone, can_Email)
select can_fname, can_hphone, can_email
from Can
where can_fname in (select can_fname from Can
                    group by can_fname having count(*) > 1)
-- select the rows that have the same fname and
-- at least the same phone or email
select can_Fname, can_Hphone, can_Email
from #tmp a where exists
(select * from #tmp b where
a.ID<>b.ID and A.can_fname = b.can_fname
and (isnull(a.can_HPhone,'') = isnull(b.can_HPhone,'')
     or isnull(a.can_email,'') = isnull(b.can_email,'')))
Try this:
SELECT Can_FName, COUNT(*)
FROM (
SELECT
rank() over(partition by Can_FName order by Can_FName,Can_HPhone) rnk_p,
rank() over(partition by Can_FName order by Can_FName,Can_EMail) rnk_m,
Can_FName
FROM Can
) X
WHERE rnk_p=1 or rnk_m =1
GROUP BY Can_FName
HAVING COUNT(*)>1