Converting Access Pivot Table to SQL Server - tsql

I'm having trouble converting an MS Access pivot table over to SQL Server and was hoping someone might help.
TRANSFORM First(contacts.value) AS FirstOfvalue
SELECT contacts.contactid
FROM contacts RIGHT JOIN contactrecord ON contacts.[detailid] = contactrecord.[detailid]
GROUP BY contacts.contactid
PIVOT contactrecord.wellknownname
;
Edit: Responding to some of the comments
Contacts table has three fields
contactid | detailid | value
1 | 1 | Scott
contactrecord has something like
detailid | wellknownname
1 | FirstName
2 | Address1
3 | foobar
contactrecord is dynamic in that the user can, at any time, create a field to be added to contacts
the Access query pulls out
contactid | FirstName | Address1 | foobar
1 | Scott | null | null
which is the pivot on wellknownname. The key here is that the number of columns is dynamic, since the user can create another field for the contact at any time. Being new to pivot tables altogether, I'm wondering how I can recreate this Access query in SQL Server.
As for TRANSFORM: that's a built-in Access statement. More information about it can be found here. First() just takes the first result on the matching row.
I hope this helps and appreciate all the help.

A quick search for dynamic pivot tables comes up with this article.
After renaming things in the last query on that page, I came up with this:
DECLARE @PivotColumnHeaders NVARCHAR(max);
-- Build the column list, e.g. [FirstName],[Address1],[foobar].
-- QUOTENAME adds the brackets and avoids the 30-character truncation of CAST(... AS varchar).
SELECT @PivotColumnHeaders = COALESCE(@PivotColumnHeaders + ',', '') + QUOTENAME(wellknownname)
FROM contactrecord;
DECLARE @PivotTableSQL NVARCHAR(max);
SET @PivotTableSQL = N'
SELECT *
FROM (
SELECT
c.contactid,
cr.wellknownname,
c.value
FROM contacts c
RIGHT JOIN contactrecord cr
on c.detailid = cr.detailid
) as pivotData
pivot(
min(value)
for wellknownname in (' + @PivotColumnHeaders +')
) as pivotTable
'
;
execute(@PivotTableSQL);
which, despite its ugliness, does the job.
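To see the shape of this dynamic approach without a SQL Server instance, here is a minimal sketch in Python using the standard-library sqlite3 module as a stand-in. Several things are assumptions and not part of the answer above: SQLite has no PIVOT operator, so the generated statement uses one MAX(CASE ...) per field name instead; it uses an inner join rather than the RIGHT JOIN; and the helper names quote_ident and pivot_contacts are hypothetical. The idea carried over is the same: read the field names first, build the column list as a string, then run the generated SQL.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Quote an identifier for SQLite (the role the [] brackets play in T-SQL)."""
    return '"' + name.replace('"', '""') + '"'


def pivot_contacts(conn: sqlite3.Connection):
    """Return (headers, rows) with one output column per wellknownname."""
    names = [row[0] for row in conn.execute(
        "SELECT wellknownname FROM contactrecord ORDER BY detailid")]
    # One aggregated CASE expression per user-defined field.
    select_list = ["c.contactid AS contactid"] + [
        f"MAX(CASE WHEN cr.wellknownname = ? THEN c.value END) AS {quote_ident(n)}"
        for n in names
    ]
    sql = (
        "SELECT " + ", ".join(select_list) + " "
        "FROM contacts c JOIN contactrecord cr ON c.detailid = cr.detailid "
        "GROUP BY c.contactid ORDER BY c.contactid"
    )
    cur = conn.execute(sql, names)  # field names are bound as parameters
    headers = [col[0] for col in cur.description]
    return headers, cur.fetchall()


# Sample data taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (contactid INT, detailid INT, value TEXT);
    CREATE TABLE contactrecord (detailid INT, wellknownname TEXT);
    INSERT INTO contacts VALUES (1, 1, 'Scott');
    INSERT INTO contactrecord VALUES (1, 'FirstName'), (2, 'Address1'), (3, 'foobar');
""")
headers, rows = pivot_contacts(conn)
print(headers)
print(rows)
```

Because the column list is rebuilt on every call, a field added to contactrecord later shows up as a new column without touching the code.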

Related

Using a PostgreSQL function inside a loop

Suppose I have a PostgreSQL function that takes 2 parameters: id (INT), email (TEXT) and can be called like this:
SELECT * FROM my_function(101, 'myemail@gmail.com')
I want to run a SELECT query from a table that would return multiple id's:
SELECT id FROM mytable
 id
-----
 101
 102
 103
How would I loop through and plug each of the returned id's into my function in a query? For this example just assume the default email is always "myemail@gmail.com".
I'm on mobile so I can't test it, but I think this may work (the subquery needs an alias):
SELECT * FROM (SELECT my_function(id, 'myemail@gmail.com') FROM mytable) AS t;
You can use a lateral cross join:
SELECT *
FROM mytable mt
cross join lateral my_function(mt.id, 'myemail@gmail.com') as mf

Join and Concatenate rows from table into a string

I have 2 tables, named fp and batch. I have to join them on the primary key of the 1st table (fp) and fp_inst_id of the 2nd table, such that my output is:
All columns of the first table, plus one column from the 2nd table that is a concatenated string of all the rows produced by joining table 1 and table 2 on fp.id and batch.fp_inst_id.
Note:
[there can be multiple fp_inst_id values (in table 2) for each unique ID (in table 1)]
Let me give you an example :
Created tables :
CREATE TABLE fp (
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
CREATE TABLE batch (
batchID int,
fp_inst_id int,
xyz varchar(255),
abc varchar(255)
);
insert into fp values(1,'savan','nahar','abc','xyz');
insert into fp values(2,'mmm','asmd','aawd','12k3mn');
insert into batch values(1,1,'garbage1', 'abc1');
insert into batch values(2,1,'garbage2', 'abc2');
insert into batch values(3,1,'garbage3', 'abc3');
insert into batch values(4,2,'garbage9', 'abc9');
If I do a normal join like this:
select * from fp join batch on fp.PersonID = batch.fp_inst_id;
What I want is :
The batch column can look different; it's OK if it uses some other delimiter, is not surrounded by [], or is separated by ';' or something similar.
What I have tried:
The same thing can be done in SQL Server using STUFF and FOR XML PATH,
but it seems to be difficult in PostgreSQL, as it doesn't support these things.
In PostgreSQL I tried string_agg, but it tells me to group by everything.
The 2nd thing I tried was:
Using a WITH clause to first create the concatenated strings of table 2, grouped by fp_inst_id. But PostgreSQL either allows the group by only on the primary key (with a normal select) or asks me to use an aggregate function.
I'm trying to do this in PostgreSQL through a query.
Thanks for the help in advance
Use array_agg to combine the batch rows and group-by to bracket the combination.
select personid,lastname,firstname,address,city,
array_agg(batch)
from fp
join batch on fp.PersonID = batch.fp_inst_id
group by personid,lastname,firstname,address,city;
eg:
jasen=# select personid,lastname,firstname,address,city,array_agg(batch) from fp join batch on fp.PersonID = batch.fp_inst_id group by 1,2,3,4,5;
personid | lastname | firstname | address | city | array_agg
----------+----------+-----------+---------+--------+---------------------------------------------------------------------
2 | mmm | asmd | aawd | 12k3mn | {"(4,2,garbage9,abc9)"}
1 | savan | nahar | abc | xyz | {"(1,1,garbage1,abc1)","(2,1,garbage2,abc2)","(3,1,garbage3,abc3)"}
Here the batch column technically contains an array of tuples, but the string representation seems acceptable.
Alternatively you can use concat_ws() to concat the values and then group by
select personid,lastname,firstname, address,city, array_agg(batch_columns) as batch_columns
from
(select fp.*, concat_ws(' / ',batch.batchid,batch.fp_inst_id, batch.xyz,batch.abc)::text as batch_columns
from fp
join batch
on fp.personid=batch.fp_inst_id)as table1
group by 1,2,3,4,5;
personid | lastname | firstname | address | city | batch_columns
----------+----------+-----------+---------+--------+---------------------------------------------------------------------------------
1 | savan | nahar | abc | xyz | {"1 / 1 / garbage1 / abc1","2 / 1 / garbage2 / abc2","3 / 1 / garbage3 / abc3"}
2 | mmm | asmd | aawd | 12k3mn | {"4 / 2 / garbage9 / abc9"}

How to remove bulk rows of duplicated chars from string in Postgres or SQLAlchemy?

I have a table with a column named "ids" , type of String. Could someone tell me how to remove the duplicated values in each of the rows?
Example, table is:
--------------------------------------------------
primary_key | ids
--------------------------------------------------
1 | {23,40,23}
--------------------------------------------------
2 | {78,40,13,78}
--------------------------------------------------
3 | {20,13,20}
--------------------------------------------------
4 | {7,2,7}
--------------------------------------------------
and I want to update it into:
--------------------------------------------------
primary_key | ids
--------------------------------------------------
1 | {23,40}
--------------------------------------------------
2 | {78,40,13}
--------------------------------------------------
3 | {20,13}
--------------------------------------------------
4 | {7,2}
--------------------------------------------------
In postgres I wrote:
UPDATE table_name
SET ids = (SELECT DISTINCT UNNEST(
(SELECT ids FROM table_name)::text[]))
In sqlalchemy I wrote:
session.query(table_name.ids).\
update({table_name.ids: func.unnest(table_name.ids,String).alias('data_view')},
synchronize_session=False)
None of these are working, so please help me, thanks in advance!
I think you could improve the design by storing these ids in another table one id per row with a foreign key referencing table_name.primary_key.
Also storing Array data as text strings seems strange.
Anyway, here is one way to do it: I wrapped the set returned by UNNEST in an inner subselect so that the aggregate function needed to concatenate the strings again (array_agg) can be applied.
UPDATE table_name
SET ids = new_ids
FROM LATERAL (
SELECT primary_key, array_agg(elem)::text AS new_ids
FROM (SELECT DISTINCT primary_key, UNNEST(ids::text[]) as elem
FROM table_name ) t_inner
GROUP by primary_key )t_sub
WHERE t_sub.primary_key = table_name.primary_key
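If the clean-up is done in application code instead (for example on rows loaded through SQLAlchemy), the de-duplication itself takes only a few lines of Python. This is a sketch under assumptions: the column holds text shaped like {23,40,23} with no quoted elements or embedded commas, and the helper name dedupe_ids_text is hypothetical. Unlike SELECT DISTINCT, which does not promise any particular order, it keeps the first occurrence of each value in its original position.

```python
def dedupe_ids_text(ids: str) -> str:
    """Remove repeated elements from a '{a,b,a}' style string, keeping first occurrences."""
    text = ids.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError(f"not an array literal: {ids!r}")
    body = text[1:-1]
    if not body:
        return "{}"
    elements = [element.strip() for element in body.split(",")]
    # dict keys are unique and preserve insertion order
    unique = list(dict.fromkeys(elements))
    return "{" + ",".join(unique) + "}"


# The rows from the question.
for raw in ["{23,40,23}", "{78,40,13,78}", "{20,13,20}", "{7,2,7}"]:
    print(raw, "->", dedupe_ids_text(raw))
```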

How to return a comma separated values of column without having to loop through the result set

Let's say I have these 2 tables
+----+---------+    +----+-----------+----------------+
| Id | Country |    | Id | CountryId | City           |
+----+---------+    +----+-----------+----------------+
| 1  | USA     |    | 1  | 1         | Washington, DC |
+----+---------+    +----+-----------+----------------+
| 2  | Canada  |    | 2  | 2         | Ottawa         |
+----+---------+    +----+-----------+----------------+
                    | 3  | 1         | New York       |
                    +----+-----------+----------------+
                    | 4  | 1         | Baltimore      |
                    +----+-----------+----------------+
I need to produce a result like:
Id | Country | Cities
---+---------+-------------------------------------
1  | USA     | Washington, DC, New York, Baltimore
2  | Canada  | Ottawa
So far, I am looping through the left side table result like this:
DECLARE @table TABLE
(
Id INT IDENTITY(1, 1),
CountryId INT,
City VARCHAR(50)
)
DECLARE @tableString TABLE
(
Id INT IDENTITY(1, 1),
CountryId INT,
Cities VARCHAR(100)
)
INSERT INTO @table
SELECT Id, City
FROM tblCountries
DECLARE @city VARCHAR(50)
DECLARE @id INT
DECLARE @count INT
DECLARE @i INT = 1
SELECT @count = COUNT(*) FROM @table
WHILE (@i <= @count)
BEGIN
-- read the country id (not the row id) of the current row
SELECT @id = CountryId, @city = City FROM @table WHERE Id = @i
IF(EXISTS(SELECT * FROM @tableString WHERE CountryId = @id))
BEGIN
UPDATE @tableString SET Cities = Cities + ', ' + @city WHERE CountryId = @id
END
ELSE
BEGIN
INSERT INTO @tableString (CountryId, Cities) VALUES (@id, @city)
END
SET @i = @i + 1
END
SELECT tc.Id, tc.Country, ts.Cities
FROM tblCountries tc
LEFT JOIN @tableString ts
ON tc.Id = ts.CountryId
My concern is that all that looping in T-SQL may be a performance killer. Even with few rows, it appears to be slow. Is there a better way to concatenate those strings without having to loop through the data set as if I were working in C#?
Thanks for helping
This has been answered many times, but I have the feeling that some explanation might help you...
... am I missing something? It seems like this is related to XML
The needed functionality, STRING_AGG(), was introduced with SQL Server 2017. The other direction, STRING_SPLIT(), came with SQL Server 2016.
But many people still use older versions (and will do so for years), so we need workarounds. There were approaches with loops, which are bad and slow, and you might use recursive CTEs. But the point here is that we can use some abilities of XML to solve this.
Try this out:
DECLARE @xml XML=
N'<root>
<element>text1</element>
<element>text2</element>
<element>text3</element>
</root>';
--The query will return the first <element> below <root> and return text1.
SELECT @xml.value(N'(/root/element)[1]','nvarchar(max)');
--But now try this:
SELECT @xml.value(N'(/root)[1]','nvarchar(max)')
The result is text1text2text3.
The reason for this: if you call .value() on an element without a detailed specification of what you want to read, you get the text of everything inside that element, concatenated. Find details here.
Now imagine an XML like this
DECLARE @xml2 XML=
N'<root>
<element>, text1</element>
<element>, text2</element>
<element>, text3</element>
</root>';
With the same query as above you'd get , text1, text2, text3. The only thing left is to cut off the leading comma and the space. This is done - in most examples - with STUFF().
So the challenge is to create this XML. And this is what you find in the linked examples.
A general example is this: Read all tables and list their columns as a CSV-list:
SELECT TOP 10
TABLE_NAME
,STUFF(
(SELECT ',' + c.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.TABLE_SCHEMA=t.TABLE_SCHEMA AND c.TABLE_NAME=t.TABLE_NAME
ORDER BY c.COLUMN_NAME
FOR XML PATH('')
),1,1,'') AS AllTableColumns
FROM INFORMATION_SCHEMA.TABLES t
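The mechanism behind the trick can be tried outside SQL Server as well. The sketch below is only an analogy in Python's standard library (xml.etree.ElementTree), not what FOR XML PATH does internally: it reads the text of the whole root element, which yields the text of all child elements joined together, and then drops the leading separator the way STUFF(..., 1, 2, '') would. The XML is written without whitespace between elements because ElementTree keeps such whitespace; the function name concat_text is hypothetical.

```python
import xml.etree.ElementTree as ET


def concat_text(xml_source: str) -> str:
    """Join every text node below the root element, in document order."""
    root = ET.fromstring(xml_source)
    return "".join(root.itertext())


plain = ("<root><element>text1</element><element>text2</element>"
         "<element>text3</element></root>")
with_separator = ("<root><element>, text1</element><element>, text2</element>"
                  "<element>, text3</element></root>")

joined = concat_text(plain)                 # analogous to .value('(/root)[1]', ...)
csv_list = concat_text(with_separator)[2:]  # cut the leading ", " as STUFF would
print(joined)
print(csv_list)
```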

LIKE search of joined and concatenated records is really slow (PostgreSQL)

I'm returning a unique list of id's from the users table, where specific columns in a related table (positions) contain a matching string.
The related table may have multiple records for each user record.
The query is taking a really, really long time (it's not scalable), so I'm wondering if I'm structuring the query wrong in some fundamental way?
Users Table:
id | name
-----------
1 | frank
2 | kim
3 | jane
Positions Table:
id | user_id | title | company | description
--------------------------------------------------
1 | 1 | manager | apple | 'Managed a team of...'
2 | 1 | assistant | apple | 'Assisted the...'
3 | 2 | developer | huawei | 'Build a feature that...'
For example: I want to return the user's id if a related positions record contains "apple" in either the title, company or description columns.
Query:
select
distinct on (users.id) users.id,
users.name,
...
from users
where (
select
string_agg(distinct users.description, ', ') ||
string_agg(distinct users.title, ', ') ||
string_agg(distinct users.company, ', ')
from positions
where positions.users_id::int = users.id
group by positions.users_id::int) like '%apple%'
UPDATE
I like the idea of moving this into a join clause. But what I'm looking to do is filter users on either of the two conditions below, and I'm not sure how to do both in a join.
1) finding the keyword in title, company, description
or
2) finding the keyword with full-text search in an associated string version of a document in another table.
select
to_tsvector(string_agg(distinct documents.content, ', '))
from documents
where users.id = documents.user_id
group by documents.user_id) @@ to_tsquery('apple')
So I was originally thinking it might look like,
select
distinct on (users.id) users.id,
users.name,
...
from users
where (
(select
string_agg(distinct users.description, ', ') ||
string_agg(distinct users.title, ', ') ||
string_agg(distinct users.company, ', ')
from positions
where positions.users_id::int = users.id
group by positions.users_id::int) like '%apple%')
or
(select
to_tsvector(string_agg(distinct documents.content, ', '))
from documents
where users.id = documents.user_id
group by documents.user_id) @@ to_tsquery('apple'))
But then it was really slow - I can confirm the slowness is from the first condition, not the full-text search.
Might not be the best solution, but a quick option is:
SELECT DISTINCT ON ( u.id ) u.id,
u.name
FROM users u
JOIN positions p ON (
p.user_id = u.id
AND ( description || title || company )
LIKE '%apple%'
);
This basically gets rid of the subquery, the unnecessary string_agg usage, the grouping on the positions table, etc.
It does a conditional join, and removing duplicates is covered by DISTINCT ON.
PS: I used table aliases u and p to shorten the example.
EDIT: also adding a WHERE example, as requested
SELECT DISTINCT ON ( u.id ) u.id,
u.name
FROM users u
JOIN positions p ON ( p.user_id = u.id )
WHERE ( p.description || p.title || p.company ) LIKE '%apple%'
OR ...your other conditions...;
EDIT2: new details were revealed that set new requirements for the original question, so I'm adding a new example for the updated ask:
Since you are doing lookups against 2 different tables (positions and uploads) combined with an OR condition, a simple JOIN won't work.
But both lookups are verification-type lookups that only check whether %apple% exists, so you do not need to aggregate, group by, or convert the data.
EXISTS, which returns TRUE for the first match found, is what you seem to need anyway. So remove all the unnecessary parts and use LIMIT 1, so that the subquery returns a row if a first match is found and no row if not (the latter makes EXISTS FALSE); this gives you the same result.
So here is how you could solve it:
SELECT DISTINCT ON ( u.id ) u.id,
u.name
FROM users u
WHERE EXISTS (
SELECT 1
FROM positions p
WHERE p.users_id = u.id::int
AND ( description || title || company ) LIKE '%apple%'
LIMIT 1
)
OR EXISTS (
SELECT 1
FROM uploads up
WHERE up.user_id = u.id::int -- you referenced a table 'document' here, but it doesn't exist in your example query, so I used the 'uploads' table you have in FROM, assuming a 'content' column exists there
AND up.content LIKE '%apple%'
LIMIT 1
);
NB! Your example queries reference tables/aliases, such as documents, that do not appear anywhere in the FROM part. Either the naming went wrong when you cut down your real query for the example, or you made a typo; this is something you need to verify, and then adjust my example query accordingly.
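One caveat with the concatenated form ( description || title || company ): in PostgreSQL, || yields NULL as soon as one operand is NULL, so a position with a missing description can never match. A hedged variant is to test each column separately inside EXISTS. The sketch below runs both forms against Python's built-in sqlite3, used purely as a stand-in (SQLite treats NULL in || the same way here). The extra position row for jane with a NULL description is invented for illustration, and user_id is assumed to be the real column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE positions (id INTEGER PRIMARY KEY, user_id INT,
                            title TEXT, company TEXT, description TEXT);
    INSERT INTO users VALUES (1, 'frank'), (2, 'kim'), (3, 'jane');
    INSERT INTO positions VALUES
        (1, 1, 'manager',   'apple',  'Managed a team of...'),
        (2, 1, 'assistant', 'apple',  'Assisted the...'),
        (3, 2, 'developer', 'huawei', 'Build a feature that...'),
        (4, 3, 'engineer',  'apple',  NULL);  -- hypothetical row, not in the question
""")

# Form used in the answer: one LIKE over the concatenated columns.
CONCATENATED = """
    SELECT u.id, u.name FROM users u
    WHERE EXISTS (SELECT 1 FROM positions p
                  WHERE p.user_id = u.id
                    AND (p.description || p.title || p.company) LIKE :pat)
    ORDER BY u.id
"""
# Variant: one LIKE per column, so a NULL in one column cannot hide the others.
PER_COLUMN = """
    SELECT u.id, u.name FROM users u
    WHERE EXISTS (SELECT 1 FROM positions p
                  WHERE p.user_id = u.id
                    AND (p.description LIKE :pat
                         OR p.title LIKE :pat
                         OR p.company LIKE :pat))
    ORDER BY u.id
"""
params = {"pat": "%apple%"}
concatenated_hits = conn.execute(CONCATENATED, params).fetchall()
per_column_hits = conn.execute(PER_COLUMN, params).fetchall()
print(concatenated_hits)
print(per_column_hits)
```

Since EXISTS only filters users and never multiplies rows, neither query in this sketch needs DISTINCT ON.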