How to do text search in postgresql with multiple results - postgresql

Let's say I get the following table when I do
select name, alternative_name from persons;
name | alternative_name
--------------------------+----------------------------------
Johnny A | John the first
Johnny B | The second John
Now with this query
select name from persons where to_tsvector(name || alternative_name) ## to_tsquery('John');:
name | alternative_name
--------------------------+----------------------------------
Johnny A | John the first
Shouldn't I get both? How can I do a full text search on both the name and columns where I get all rows that match the search query?
Edit: Yes, there is indeed a typo here. It is to_tsquery

you concat without space:
t=# with c(n,a) as (values('Johnny A','John the first'),('Johny B','The second John'))
select * from c
where to_tsvector(n || a) ## to_tsquery('John')
;
n | a
---------+-----------------
Johny B | The second John
(1 row)
so first haystack becomes Johnny AJohn the first, thus lexeme do not match, try:
t=# with c(n,a) as (values('Johnny A','John the first'),('Johny B','The second John'))
select * from c
where to_tsvector(n ||' '|| a) ## to_tsquery('John')
;
n | a
----------+-----------------
Johnny A | John the first
Johny B | The second John
(2 rows)

Related

tsql - How to convert multiples rows and columns into one row

id | acct_num | name | orderdt
1 1006A Joe Doe 1/1/2021
2 1006A Joe Doe 1/5/2021
EXPECTED OUTPUT
id | acct_num | name | orderdt | id1 | acct_num1 | NAME1 | orderdt1
1 1006A Joe Doe 1/1/2021 2 1006A Joe Doe 1/5/2021
My query is the following:
Select id,
acct_num,
name,
orderdt
from order_tbl
where acct_num = '1006A'
and orderdt >= '1/1/2021'
If you always have one or two rows you could do it like this (I'm assuming the latest version of SQL Server because you said TSQL):
NOTE: If you have a known max (eg 4) this solution can be converted to support any number by changing the modulus and adding more columns and another join.
WITH order_table_numbered as
(
SELECT ID, ACCT_NUM, NAME, ORDERDT,
ROW_NUMBER() AS (PARTITION BY ACCT_NUM ORDER BY ORDERDT) as RN
)
SELECT first.id as id, first.acct_num as acct_num, first.num as num, first.order_dt as orderdt,
second.id as id1, second.acct_num as acct_num1, second.num as num1, second.order_dt as orderdt1
FROM order_table_numbered first
LEFT JOIN order_table_numbered second ON first.ACCT_NUM = second.ACCT_NUM and (second.RN % 2 = 0)
WHERE first.RN % 2 = 1
If you have an unknown number of rows I think you should solve this on the client OR convert the groups to XML -- the XML support in SQL Server is not bad.

How to find which posts have the highest comments and which posts have the fewest comments?

I am very new to postgreSQl and SQL and databases, I hope you guys can help me with this, i want to know which posts have the most amount of comments and which have the least amount of comments and the users need to be specified too.
CREATE SCHEMA perf_demo;
SET search_path TO perf_demo;
-- Tables
CREATE TABLE users(
id SERIAL -- PRIMARY KEY
, email VARCHAR(40) NOT NULL UNIQUE
);
CREATE TABLE posts(
id SERIAL -- PRIMARY KEY
, user_id INTEGER NOT NULL -- REFERENCES users(id)
, title VARCHAR(100) NOT NULL UNIQUE
);
CREATE TABLE comments(
id SERIAL -- PRIMARY KEY
, user_id INTEGER NOT NULL -- REFERENCES users(id)
, post_id INTEGER NOT NULL -- REFERENCES posts(id)
, body VARCHAR(500) NOT NULL
);
-- Generate approx. N users
-- Note: NULL values might lead to lesser rows than N value.
INSERT INTO users(email)
WITH query AS (
SELECT 'user_' || seq || '#'
|| ( CASE (random() * 5)::INT
WHEN 0 THEN 'my'
WHEN 1 THEN 'your'
WHEN 2 THEN 'his'
WHEN 3 THEN 'her'
WHEN 4 THEN 'our'
END )
|| '.mail' AS email
FROM generate_series(1, 5) seq -- Important: Replace N with a useful value
)
SELECT email
FROM query
WHERE email IS NOT NULL;
-- Generate N posts
INSERT INTO posts(user_id, title)
WITH expanded AS (
SELECT random(), seq, u.id AS user_id
FROM generate_series(1, 8) seq, users u -- Important: Replace N with a useful value
),
shuffled AS (
SELECT e.*
FROM expanded e
INNER JOIN (
SELECT ei.seq, min(ei.random) FROM expanded ei GROUP BY ei.seq
) em ON (e.seq = em.seq AND e.random = em.min)
ORDER BY e.seq
)
-- Top 20 programming languages: https://www.tiobe.com/tiobe-index/
SELECT s.user_id,
'Let''s talk about (' || s.seq || ') '
|| ( CASE (random() * 19 + 1)::INT
WHEN 1 THEN 'C'
WHEN 2 THEN 'Python'
WHEN 3 THEN 'Java'
WHEN 4 THEN 'C++'
WHEN 5 THEN 'C#'
WHEN 6 THEN 'Visual Basic'
WHEN 7 THEN 'JavaScript'
WHEN 8 THEN 'Assembly language'
WHEN 9 THEN 'PHP'
WHEN 10 THEN 'SQL'
WHEN 11 THEN 'Ruby'
WHEN 12 THEN 'Classic Visual Basic'
WHEN 13 THEN 'R'
WHEN 14 THEN 'Groovy'
WHEN 15 THEN 'MATLAB'
WHEN 16 THEN 'Go'
WHEN 17 THEN 'Delphi/Object Pascal'
WHEN 18 THEN 'Swift'
WHEN 19 THEN 'Perl'
WHEN 20 THEN 'Fortran'
END ) AS title
FROM shuffled s;
-- Generate N comments
-- Note: The cross-join is a performance killer.
-- Try the SELECT without INSERT with small N values to get an estimation of the execution time.
-- With these values you can extrapolate the execution time for a bigger N value.
INSERT INTO comments(user_id, post_id, body)
WITH expanded AS (
SELECT random(), seq, u.id AS user_id, p.id AS post_id
FROM generate_series(1, 10) seq, users u, posts p -- Important: Replace N with a useful value
),
shuffled AS (
SELECT e.*
FROM expanded e
INNER JOIN ( SELECT ei.seq, min(ei.random) FROM expanded ei GROUP BY ei.seq ) em ON (e.seq = em.seq AND e.random = em.min)
ORDER BY e.seq
)
SELECT s.user_id, s.post_id, 'Here some comment: ' || md5(random()::text) AS body
FROM shuffled s;
Could someone show me how this could be done please, I am new to SQL/postgres any help would be much appreciated. an Example would be very helpful too.
Good effort in pasting the whole dataset creation procedure, is what it needs to be included in order to make the example reproducible.
Let's start first with, how to join several tables: you have your posts table which contains the user_id and we can use it to join with users with the following.
SELECT email,
users.id user_id,
posts.id post_id,
title
from posts join users
on posts.user_id=users.id;
This will list the posts together with the authors. Check the joining condition (after the ON) stating the fields we're using. The result should be similar to the below
email | user_id | post_id | title
------------------+---------+---------+----------------------------------------
user_1#her.mail | 1 | 5 | Let's talk about (5) Visual Basic
user_1#her.mail | 1 | 2 | Let's talk about (2) Assembly language
user_3#her.mail | 3 | 8 | Let's talk about (8) R
user_3#her.mail | 3 | 7 | Let's talk about (7) Perl
user_4#her.mail | 4 | 6 | Let's talk about (6) Visual Basic
user_5#your.mail | 5 | 4 | Let's talk about (4) R
user_5#your.mail | 5 | 3 | Let's talk about (3) C
user_5#your.mail | 5 | 1 | Let's talk about (1) Ruby
(8 rows)
Now it's time to join this result, with the comments table. Since a post can have comments or not and you want to show all posts even if you don't have any comments you should use the LEFT OUTER JOIN (more info about join types here
So let's rewrite the above to include comments
SELECT email,
users.id user_id,
posts.id post_id,
title,
comments.body
from posts
join users
on posts.user_id=users.id
left outer join comments
on posts.id = comments.post_id
;
Check out the join between posts and comments based on post_id.
The result of the query is the list of posts, related author and comments, similar to the below
email | user_id | post_id | title | body
------------------+---------+---------+----------------------------------------+-----------------------------------------------------
user_1#her.mail | 1 | 5 | Let's talk about (5) Visual Basic |
user_1#her.mail | 1 | 2 | Let's talk about (2) Assembly language |
user_3#her.mail | 3 | 8 | Let's talk about (8) R | Here some comment: 200bb07acfbac893aed60e018b47b92b
user_3#her.mail | 3 | 8 | Let's talk about (8) R | Here some comment: 66159adaed11404b1c88ca23b6a689ef
user_3#her.mail | 3 | 8 | Let's talk about (8) R | Here some comment: e5cc1f7c10bb6103053bf281d3cadb60
user_3#her.mail | 3 | 8 | Let's talk about (8) R | Here some comment: 5ae8674c2ef819af0b1a93398efd9418
user_3#her.mail | 3 | 7 | Let's talk about (7) Perl | Here some comment: 5b818da691c1570dcf732ed8f6b718b3
user_3#her.mail | 3 | 7 | Let's talk about (7) Perl | Here some comment: 88a990e9495841f8ed628cdce576a766
user_4#her.mail | 4 | 6 | Let's talk about (6) Visual Basic |
user_5#your.mail | 5 | 4 | Let's talk about (4) R | Here some comment: ed19bb476eb220d6618e224a0ac2910d
user_5#your.mail | 5 | 3 | Let's talk about (3) C | Here some comment: 23cd43836a44aeba47ad212985f210a7
user_5#your.mail | 5 | 1 | Let's talk about (1) Ruby | Here some comment: b83999120bd2bb09d71aa0c6c83a05dd
user_5#your.mail | 5 | 1 | Let's talk about (1) Ruby | Here some comment: b4895f4e0aa0e0106b5d3834af80275e
(13 rows)
Now you can start aggregating and counting comments for a certain post. You can use PG's aggregation functions, we'll use the COUNT here.
SELECT email,
users.id user_id,
posts.id post_id,
title,
count(comments.id) nr_comments
from posts
join users
on posts.user_id=users.id
left outer join comments
on posts.id = comments.post_id
group by email,
users.id,
posts.id,
title
;
Check out that we're counting the comments.id field but we could also perform a count(*) which just counts the rows. Also check that we are grouping our results by email, users.id, post.id and title, the columns we are showing alongside the count.
The result should be similar to
email | user_id | post_id | title | nr_comments
------------------+---------+---------+----------------------------------------+-------------
user_3#her.mail | 3 | 7 | Let's talk about (7) Perl | 2
user_5#your.mail | 5 | 3 | Let's talk about (3) C | 1
user_5#your.mail | 5 | 1 | Let's talk about (1) Ruby | 2
user_3#her.mail | 3 | 8 | Let's talk about (8) R | 4
user_1#her.mail | 1 | 5 | Let's talk about (5) Visual Basic | 0
user_5#your.mail | 5 | 4 | Let's talk about (4) R | 1
user_4#her.mail | 4 | 6 | Let's talk about (6) Visual Basic | 0
user_1#her.mail | 1 | 2 | Let's talk about (2) Assembly language | 0
(8 rows)
This should be the result you're looking for. Just bear in mind, that you're showing the user from users who wrote the post, not the one who commented. To view who commented you'll need to change the joining conditions.

Update all fields in a column from subquery postgresql

I have one table, say buchas containing a list of products. I had to create another table buchas_type with the type (class if you will) of the products in buchas.
Now I have to reference the class of product in buchas, by adding a type_id column.
So, my problem can be divided in two parts:
I need a subquery to "guess" the product type by its name. That should be somewhat easy, since the name of the product contains somewhere the name of its type. Example:
-----------------Name----------------- | ---Type---
BUCHA GOB 1600 GOB
The problem is I have a type GOB(2) that will mess things up with the type GOB. (I have also other look alike types raising the same problem).
So far, I got this:
SELECT buchas_type.id
FROM buchas_type
JOIN buchas ON buchas.description LIKE '%' || buchas_type.type || '%'
ORDER BY length(type) desc LIMIT 1;
That will solve the problem with the look alike types, since it returns the longest match. However, I need a WHERE clause otherwise I get always the same buchas_type_id. If I try for instance Where buchas.id = 50 I get a correct result.
The second part of the problem is the UPDATE command itself. I need to fill up the recently created buchas_type_id from the subquery I showed in (1). So, for every row in buchas, it has to search for the type in buchas_type using the current row's description as a parameter.
How to achieve that?
Assuming you have a primary key on buchas, the first part of your problem can be solved with distinct on.
select * from buchas;
buchas_id | description | buchas_type_id
-----------+-------------------+----------------
1 | BUCHA GOB 1600 |
2 | BUCHA GOB(2) 1700 |
(2 rows)
select * from buchas_type;
buchas_type_id | type
----------------+--------
1 | GOB
2 | GOB(2)
(2 rows)
select distinct on (b.buchas_id) b.buchas_id, t.buchas_type_id, b.description, t.type
from buchas b
join buchas_type t
on b.description ilike '%'||t.type||'%'
order by b.buchas_id, length(t.type) desc;
buchas_id | buchas_type_id | description | type
-----------+----------------+-------------------+--------
1 | 1 | BUCHA GOB 1600 | GOB
2 | 2 | BUCHA GOB(2) 1700 | GOB(2)
(2 rows)
With this solved, you can do an update...from:
with type_map as (
select distinct on (b.buchas_id) b.buchas_id, t.buchas_type_id
from buchas b
join buchas_type t
on b.description ilike '%'||t.type||'%'
order by b.buchas_id, length(t.type) desc
)
update buchas
set buchas_type_id = t.buchas_type_id
from type_map t
where t.buchas_id = buchas.buchas_id;
UPDATE 2
select * from buchas b join buchas_type t on t.buchas_type_id = b.buchas_type_id;
buchas_id | description | buchas_type_id | buchas_type_id | type
-----------+-------------------+----------------+----------------+--------
1 | BUCHA GOB 1600 | 1 | 1 | GOB
2 | BUCHA GOB(2) 1700 | 2 | 2 | GOB(2)
(2 rows)

Postgres : Get multiple columns with group by

Table
select * from hello;
id | name
----+------
1 | abc
2 | xyz
3 | abc
4 | dfg
5 | abc
(5 rows)
Query
select name,count(*) from hello where name in ('abc', 'dfg') group by name;
name | count
------+-------
dfg | 1
abc | 3
(2 rows)
In the above query, I am trying to get the count of the rows whose name is in the tuple. However, I want to get the id as well with the count of the names. Is there a way this can be achievable? Thanks
If you want to return the "id" values, then you can use a window function:
select id, name, count(*) over(PARTITION BY name)
from hello
where name in ('abc', 'dfg');
This will return the id values along with the count of rows per name.
If you want to see all IDs for each name, you need to aggregate them:
select name, count(*), array_agg(id) as ids
from hello
where name in ('abc', 'dfg')
group by name;
This returns something like this:
name | count | ids
-----+-------+--------
abc | 3 | {1,3,5}
dfg | 1 | {4}

Dynamic "INSERT INTO" Query with User Defined Type

I have got a SQL-Table were eache line consists a singel Value of some kind of Virtuel-Tabel - means the real existig SQL-Table looks like this:
-----------------------------------------
|DataRecordset | DataField | DataValue |
-----------------------------------------
| 1 | Firstname | John |
| 1 | Lastname | Smith |
| 1 | Birthday | 18.12.1963 |
| 2 | Firstname | Jane |
| 2 | Lastname | Smith |
| 2 | Birthday | 14.06.1975 |
-----------------------------------------
and I need to get something that feels like this:
-------------------------------------
| Firstname | Lastname | Birthday |
-------------------------------------
| John | Smith | 18.12.1963 |
| Jane | Smith | 14.06.1975 |
-------------------------------------
the reason why the real existing SQL-Table is stored like the first one is, that there are a lot more information around the core-data... like who write the data... when was the data written... from which to which time was the data significant... so there are a lot of diffrent variabels which decides which line from the first table i use to generate the second one.
I created a User-Defined-Tabletype on the SQL-Server which looks like the second table.
Then i start writing a procedure...
DECLARE #secondTable secondTable_Typ
DECLARE firstTable_Cursor CURSOR FOR SELECT DataRecordset, ... WHERE...lot of Text
OPEN firstTable_Cursor
FETCH NEXT FROM firstTable_Cursor
INTO #DataRecordset, #...
WHILE ##FETCH_STATUS = 0
BEGIN
IF NOT EXISTS(SELECT * FROM #secondTable WHERE DataRecordset= #DataRecordset)
BEGIN
the Problem i have... now i need some kind of dynamic Query, because i want do something like this:
INSERT INTO #secondTable (DataRecordset, #DataField ) VALUES (#DataRecordset, #DataValue)
but i cant use the variable #DataField like this... so i used google and found the function sp_executesql... i wrote the following code:
SET #sqlString = 'INSERT INTO #xsecondTable (DataRecordset, ' + #DataField + ') VALUES (#xDataRecordset, #xDataValue)'
EXEC sp_executesql #sqlString, N'#xsecondTable secondTable_Typ, #xDataRecordset smallint, #xDataValue sql_variant', #secondTable , #DataRecordset, #DataValue
but when i run the procedure i got an error that means i have to add a parameter "READONLY" to "#xsecondTable"...
i think the problem is, that sp_executesql can use variables as input or as outup... but i am not shure if its possiple to get this user defined table type into this procedure...
someone any idea how to get this code to run?
thank you very much
Have you considered doing a PIVOT on the data? Something along the lines of:
SELECT
[Firstname]
, [Lastname]
, [Birthday]
FROM
(
SELECT
[DataRecordset]
, [DataField]
, [DataValue]
FROM [Table]
) DATA
PIVOT
(
MIN ([DataValue]) FOR [DataField] IN
(
[Firstname]
, [Lastname]
, [Birthday]
)
) PVT