Let's say my select statement in t-sql returns
John Smith | Chicago | a
John Smith | Chicago | f
John Smith | Chicago | j
Josh Dude | Houston | p
Josh Dude | Houston | s
And I want it to return
John Smith | Chicago | a, f, j
Josh Dude | Houston | p, s
How would I do that?
The first half of the article "T-SQL: Normalized data to a single comma delineated string and back" is a code- and image-heavy tutorial on how to use STUFF with a FOR XML PATH statement to take a normalized set and turn one column into a single comma-separated value.
Good luck.
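For readers who want to see the target shape before reading the article, here is a minimal sketch of the same group-and-concatenate step. It is not the STUFF / FOR XML PATH technique itself: it uses SQLite's group_concat through Python's sqlite3 module as a stand-in, and the table name `people` and column names `name`, `city`, `code` are assumptions, since the question does not give them. On SQL Server 2017 and later, STRING_AGG plays a similar role.

```python
import sqlite3


def aggregate_codes(rows):
    """Collapse (name, city, code) rows into one row per (name, city)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; the question does not name the table or columns.
    conn.execute("CREATE TABLE people (name TEXT, city TEXT, code TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT name, city, group_concat(code, ', ') "
        "FROM people GROUP BY name, city ORDER BY name"
    ).fetchall()
    conn.close()
    return result


rows = [
    ("John Smith", "Chicago", "a"),
    ("John Smith", "Chicago", "f"),
    ("John Smith", "Chicago", "j"),
    ("Josh Dude", "Houston", "p"),
    ("Josh Dude", "Houston", "s"),
]
combined = aggregate_codes(rows)
for name, city, codes in combined:
    print(f"{name} | {city} | {codes}")
```

SQLite does not promise the order of values inside group_concat, so treat the order of the letters within each row as unspecified in this sketch.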
I have a table that looks like this:
id | name | address | code
-----------+--------------------------+--------------------+----------
101 | joe smith | 1 long road | SC1
102 | joe smith | 6 long road | SC1
103 | amy hughes | 5 hillside lane | SC5
104 | amy hughes | 5 hillside lane | SC5
I want to return the rows that are duplications based on name and code but have different address fields.
I had something like this originally (which looked for duplications across the name, address and code columns):
SELECT name, address, code, count(*)
FROM table_name
GROUP BY 1,2,3
HAVING count(*) >1;
Is there a way I can expand on the above to only return rows that have the same name and code but different address fields?
In my example data above, I would only want to return:
id | name | address | code
-----------+--------------------------+--------------------+----------
101 | joe smith | 1 long road | SC1
102 | joe smith | 6 long road | SC1
Remove address from the select list and GROUP BY and use count(DISTINCT):
SELECT name, code, count(DISTINCT address)
FROM table_name
GROUP BY name, code
HAVING count(DISTINCT address) > 1;
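The query above returns one line per (name, code) pair rather than the original rows. If the full rows with their ids are wanted, as in the question's expected output, one possible extension is to join the grouped result back to the table. The sketch below assumes that approach and runs it in SQLite through Python's sqlite3 module purely so it can be executed; the join itself is plain SQL.

```python
import sqlite3


def rows_with_conflicting_addresses(rows):
    """Return full rows whose (name, code) pair has more than one address."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_name (id INTEGER, name TEXT, address TEXT, code TEXT)"
    )
    conn.executemany("INSERT INTO table_name VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT t.id, t.name, t.address, t.code
        FROM table_name AS t
        JOIN (SELECT name, code
              FROM table_name
              GROUP BY name, code
              HAVING count(DISTINCT address) > 1) AS d
          ON d.name = t.name AND d.code = t.code
        ORDER BY t.id
        """
    ).fetchall()
    conn.close()
    return result


rows = [
    (101, "joe smith", "1 long road", "SC1"),
    (102, "joe smith", "6 long road", "SC1"),
    (103, "amy hughes", "5 hillside lane", "SC5"),
    (104, "amy hughes", "5 hillside lane", "SC5"),
]
conflicts = rows_with_conflicting_addresses(rows)
for row in conflicts:
    print(row)
```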
I have a table with columns like:
Name (Single Line Text) | Partner (Link to "Name")
----------------------- | --------------------------
John Doe | `Jane Doe`
Jane Doe | (Empty)
Bob Smith | `Mary Smith` `Kevin Smith`
Mary Smith | (Empty)
Kevin Smith | (Empty)
Alan Stephens | (Empty)
I'd like to create a new view for this table with filtering like:
WHERE {Partner (field)} is not empty
OR {Partner (column)} does not contain {Name}
such that the result would be:
Name (Single Line Text) | Partner (Link to "Name")
----------------------- | --------------------------
John Doe | `Jane Doe`
Bob Smith | `Mary Smith` `Kevin Smith`
Alan Stephens | (Empty)
The challenge here is that on line 1 of my filter, I'm looking for "records with no records linked in their Partner column", but on line 2 I'm looking for "records which are not found in the Partner column for any other record".
The overall goal is to generate a list of records with no Partner(s) plus records who are not anyone else's Partner. Is there a way to achieve this?
I discovered that a filter such as this one is outside the available feature set of Airtable. I was able to implement a workaround:
Add a Script block containing the script described in this video (script gist)
Use the Script block to populate a new column on the table (Partner Of)
Name (Single Line Text) | Partner (Link to "Name") | Partner Of (Link to "Name")
----------------------- | -------------------------- | ---------------------------
John Doe | `Jane Doe` | (Empty)
Jane Doe | (Empty) | `John Doe`
Bob Smith | `Mary Smith` `Kevin Smith` | (Empty)
Mary Smith | (Empty) | `Bob Smith`
Kevin Smith | (Empty) | `Bob Smith`
Alan Stephens | (Empty) | (Empty)
Create a new view with a filter like:
WHERE {Partner Of} is empty
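The idea behind the Partner Of column is an inverse lookup: for each record, list the records that link to it. The sketch below illustrates that logic in plain Python on the sample data. It is not the Airtable script from the gist, and the function name `partner_of` and the dictionary layout are assumptions made for illustration.

```python
def partner_of(partners):
    """Invert a name -> linked partners mapping into name -> linking records."""
    inverse = {name: [] for name in partners}
    for name, linked in partners.items():
        for partner in linked:
            inverse.setdefault(partner, []).append(name)
    return inverse


# Sample data from the table above: each name maps to its Partner links.
partners = {
    "John Doe": ["Jane Doe"],
    "Jane Doe": [],
    "Bob Smith": ["Mary Smith", "Kevin Smith"],
    "Mary Smith": [],
    "Kevin Smith": [],
    "Alan Stephens": [],
}
inverse = partner_of(partners)

# Equivalent of the view filter: keep records whose Partner Of is empty.
view = [name for name in partners if not inverse[name]]
print(view)
```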
I have a query that results in such a table:
guardian_id | child_id | guardian_name | relation | child_name |
------------|----------|---------------|----------|------------|
1 | 1 | John Doe | father | Doe Son |
2 | 1 | Jane Doe | mother | Doe Son |
3 | 2 | Peter Pan | father | Pan Dghter |
4 | 2 | Pet Pan | mother | Pan Dghter |
1 | 3 | John Doe | father | Doe Dghter |
2 | 3 | Jane Doe | mother | Doe Dghter |
So from these results, I need to count the families, that is, distinct sets of guardians shared by one or more children. In the results above, there are 3 children but 2 families. How can I achieve this?
If I do:
SELECT COUNT(DISTINCT child_id) AS families FROM (
-- larger query
) a
I'll get 3 which is not correct.
Alternatively, how can I incorporate a WHERE clause that checks DISTINCT guardian_id's? Any other approaches?
Also note that there are instances where a child may have one guardian only.
To get the distinct family you can try the following approach.
select distinct array_agg(distinct guardian_id)
from family
group by child_id;
The above query will return the list of unique families.
e.g.
{1,2}
{3,4}
Now you can apply the count on top of it.
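In PostgreSQL, applying the count amounts to wrapping the query above as a subquery, along the lines of `select count(*) from (...) f`. The same reasoning can be checked outside the database: collect the set of guardians for each child, then count the distinct sets. The sketch below does that in Python on the sample data; the function name `count_families` is made up for this illustration.

```python
from collections import defaultdict


def count_families(pairs):
    """Count distinct guardian sets across children.

    pairs is an iterable of (guardian_id, child_id) tuples.
    """
    guardians_by_child = defaultdict(set)
    for guardian_id, child_id in pairs:
        guardians_by_child[child_id].add(guardian_id)
    # frozenset makes each guardian set hashable, so duplicates collapse.
    return len({frozenset(g) for g in guardians_by_child.values()})


# (guardian_id, child_id) pairs from the result table in the question.
pairs = [(1, 1), (2, 1), (3, 2), (4, 2), (1, 3), (2, 3)]
families = count_families(pairs)
print(families)
```

A child with a single guardian simply forms a one-element set, so that case needs no special handling here.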
I am trying to translate the following SQL query into Tableau:
select store1.name, store1.city, store1.order_date
from store1
where order_date = (select max(store2.order_date) from store2
where store2.name = store1.name
and store2.city = store1.city)
I am quite new to Tableau and can't figure out how to translate the where clause as it is selecting from another table.
For example, given the following tables
Store 1:
Name | City | Order Date
Andrew | Boston | 23-Aug-16
Bob | Boston | 31-Jan-17
Cathy | Boston | 31-Jan-17
Cathy | San Diego | 19-Jan-17
Dan | New York | 3-Dec-16
Store 2:
Name | City | Order Date
Andrew | Boston | 2-Sep-16
Brandy | Miami | 4-Feb-17
Cathy | Boston | 31-Jan-17
Cathy | Boston | 2-Mar-16
Dan | New York | 2-Jul-16
My query would return the following from Store 1:
Name | City | Order Date
Bob | Boston | 31-Jan-17
Cathy | Boston | 31-Jan-17
Point for point, converting that SQL query into Tableau Custom SQL Query would be:
SELECT [Store1].[Name], [Store1].[City], [Store1].[Order Date]
FROM [Store1]
WHERE [Order Date] = (SELECT MAX([Store2].[Order Date]) FROM [Store2]
WHERE [Store2].[Name] = [Store1].[Name]
AND [Store2].[City] = [Store1].[City])
In the preview you will notice it will only return Cathy. But once you join the SQL Query onto your primary table on Order Date, you will see both Bob and Cathy as you expect.
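To see why the preview shows only Cathy, it can help to run the query on the sample data. The sketch below does so in SQLite through Python's sqlite3 module, which is an assumption made only to get something runnable; the lowercase table and column names and the ISO date strings are likewise choices of this sketch. Bob has no row in Store 2, so the subquery yields NULL for him and the equality test fails. The second query is one reading of "join onto your primary table on Order Date": it matches Store 1 rows to the query result by date alone, which brings Bob back because he shares Cathy's date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE store1 (name TEXT, city TEXT, order_date TEXT);
    CREATE TABLE store2 (name TEXT, city TEXT, order_date TEXT);
    """
)
conn.executemany(
    "INSERT INTO store1 VALUES (?, ?, ?)",
    [
        ("Andrew", "Boston", "2016-08-23"),
        ("Bob", "Boston", "2017-01-31"),
        ("Cathy", "Boston", "2017-01-31"),
        ("Cathy", "San Diego", "2017-01-19"),
        ("Dan", "New York", "2016-12-03"),
    ],
)
conn.executemany(
    "INSERT INTO store2 VALUES (?, ?, ?)",
    [
        ("Andrew", "Boston", "2016-09-02"),
        ("Brandy", "Miami", "2017-02-04"),
        ("Cathy", "Boston", "2017-01-31"),
        ("Cathy", "Boston", "2016-03-02"),
        ("Dan", "New York", "2016-07-02"),
    ],
)

# The correlated subquery from the question, unchanged in logic.
LATEST_SQL = """
    SELECT s1.name, s1.city, s1.order_date
    FROM store1 AS s1
    WHERE s1.order_date = (SELECT MAX(s2.order_date) FROM store2 AS s2
                           WHERE s2.name = s1.name AND s2.city = s1.city)
"""
latest = conn.execute(LATEST_SQL).fetchall()

# Join the result back to store1 on order date only.
joined = conn.execute(
    f"""
    SELECT p.name, p.city, p.order_date
    FROM store1 AS p
    JOIN ({LATEST_SQL}) AS q ON q.order_date = p.order_date
    ORDER BY p.name
    """
).fetchall()
conn.close()

print(latest)
print(joined)
```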
I need to join two tables based on names. The problem is that a name may be slightly misspelled in one of the tables. I have remedied this problem in the past using Stata and Python's fuzzy merging, where names are matched based on how similar they are, but I am wondering if this is possible to do in PostgreSQL.
For example, my data may be something similar to this:
Table A:
first_name_a | last_name_a | id_a
----------------------------------
William | Hartnell | 1
Matt | Smithe | 2
Paul | McGann | 3
David | Tennant | 4
Colin | Baker | 5
Table B:
first_name_b | last_name_b | id_b
----------------------------------
Matt | Smith | a
Peter | Davison | b
Dave | Tennant | c
Colin | Baker | d
Will | Hartnel | e
And in the end, I hope my results would look something like:
first_name_a | last_name_a | id_a | first_name_b | last_name_b | id_b
----------------------------------------------------------------------
William | Hartnell | 1 | Will | Hartnel | e
Matt | Smithe | 2 | Matt | Smith | a
Paul | McGann | 3 | | |
David | Tennant | 4 | Dave | Tennant | c
Colin | Baker | 5 | Colin | Baker | d
| | | Peter | Davison | b
My Sonic Screwdriver gives me some pseudo-code like this:
SELECT a.*, b.* FROM A a
JOIN B b
WHERE LEVENSHTEIN(first_name_a, first_name_b) IS LESS THAN 1
AND LEVENSHTEIN(last_name_a, last_name_b) IS LESS THAN 1
The DML you mention:
SELECT a.*, b.* FROM A a
JOIN B b
WHERE LEVENSHTEIN(first_name_a, first_name_b) IS LESS THAN 1
AND LEVENSHTEIN(last_name_a, last_name_b) IS LESS THAN 1
Looks correct, just bump up the 'fuzziness': a distance of less than 1 only matches identical strings, so in 'IS LESS THAN 1' replace the 1 with the fuzziness level that you require.
See http://www.postgresql.org/docs/9.1/static/fuzzystrmatch.html for reference info on LEVENSHTEIN.
Done up as an SQLFiddle. Play with the thresholds/look at some of the other mapping functions mentioned in matching fuzzy strings.
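To get a feel for which thresholds the sample data needs, here is a small Python sketch that computes the edit distance by hand and pairs rows whose first and last names are both close enough. The thresholds (3 for first names, 1 for last names) and the function names are assumptions of this sketch, picked so that William/Will and David/Dave still match. In PostgreSQL the equivalent comparison would use levenshtein(text, text) from the fuzzystrmatch extension, with an ordinary `<=` operator and a join condition in place of the pseudo-code's IS LESS THAN.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_matches(table_a, table_b, max_first=3, max_last=1):
    """Pair ids whose first and last names are within the given distances."""
    return [
        (id_a, id_b)
        for first_a, last_a, id_a in table_a
        for first_b, last_b, id_b in table_b
        if levenshtein(first_a, first_b) <= max_first
        and levenshtein(last_a, last_b) <= max_last
    ]


table_a = [
    ("William", "Hartnell", 1),
    ("Matt", "Smithe", 2),
    ("Paul", "McGann", 3),
    ("David", "Tennant", 4),
    ("Colin", "Baker", 5),
]
table_b = [
    ("Matt", "Smith", "a"),
    ("Peter", "Davison", "b"),
    ("Dave", "Tennant", "c"),
    ("Colin", "Baker", "d"),
    ("Will", "Hartnel", "e"),
]
matches = fuzzy_matches(table_a, table_b)
print(matches)
```

This only lists the matched pairs. Keeping the unmatched rows (Paul McGann and Peter Davison in the sample) would need a full outer join in SQL, or an extra pass over both tables here.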