Error thrown when trying to chain a text search with a search on a join table that uses group_by - pg-search

I have three models, Story, Post and Tag. Story is in a one-to-many relationship with Post, and a many-to-many relationship with Tag.
I use the following scope to search for stories that have a least one tag that matches an array of tags:
scope :in_tags, lambda {|category_array, tag_array|
Story.joins('LEFT OUTER JOIN stories_tags ON stories_tags.story_id = stories.id').
joins('LEFT OUTER JOIN tags ON tags.id = stories_tags.tag_id').
where("tags.name in (:tag_array)", :tag_array => tag_array ).
group("stories.id")
}
Note that I use GROUP_BY so that the scope doesn't return a story multiple times if that story has multiple matching tags.
I use the following pg_search_scope to search for stories or posts with specified text:
pg_search_scope :with_text,
:against => :title,
:using => { :tsearch => { :dictionary => "english" }},
:associated_against => { :posts => :contents }
Both of these scopes work fine independent of each other. When I try to chain them together, however, I experience the postgreSQL error:
ActiveRecord::StatementInvalid (PG::Error: ERROR: column "pg_search_79dd68cf7f962ac568b3d7.pg_search_c5f43d73058486e1799d26" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...e"::text, '')) || to_tsvector('english', coalesce(pg_search_...
^
: SELECT "stories".*, ((ts_rank((to_tsvector('english', coalesce("stories"."title"::text, '')) || to_tsvector('english', coalesce(pg_search_79dd68cf7f962ac568b3d7.pg_search_c5f43d73058486e1799d26::text, ''))), (to_tsquery('english', ''' ' || 'track' || ' ''')), 0))) AS pg_search_rank FROM "stories" LEFT OUTER JOIN stories_tags ON stories_tags.story_id = stories.id LEFT OUTER JOIN tags ON tags.id = stories_tags.tag_id LEFT OUTER JOIN (SELECT "stories"."id" AS id, string_agg("posts"."contents"::text, ' ') AS pg_search_c5f43d73058486e1799d26 FROM "stories" INNER JOIN "posts" ON "posts"."story_id" = "stories"."id" GROUP BY "stories"."id") pg_search_79dd68cf7f962ac568b3d7 ON pg_search_79dd68cf7f962ac568b3d7.id = "stories"."id" WHERE (tags.name in (sports') AND (((to_tsvector('english', coalesce("stories"."title"::text, '')) || to_tsvector('english', coalesce(pg_search_79dd68cf7f962ac568b3d7.pg_search_c5f43d73058486e1799d26::text, ''))) ## (to_tsquery('english', ''' ' || 'track' || ' ''')))) GROUP BY stories.id ORDER BY latest_post_id DESC LIMIT 5000 OFFSET 0):
Is there any way to use pg_search with a scope that requires a join table with a GROUP_BY clause?

I ended up having to switch from the .group() method (which I couldn't seem to get working with pg_search) to select("DISTINCT(stories.id), stories.*). I also made use of ActiveRecords#preload so to eagerly load the associated tag records

Related

Spring Data JPA without entity class

We are using Spring Data JPA + Hibernate approach for implementation.
For complex queries with joins we have created a common mapping xml where we are defining those queries.
We are not creating each entity class for these complex queries.
How can we use JPARepository and execute these query?
Is there any way? Below is one of the query written in orm.xml :
SELECT FMT_NAME( pers.id ) AS customer_name, first_name, mid_name,
last_name,
addr.line_1_addr,
addr.line_2_addr,
RTRIM( LTRIM( addr.city_name || ', ' || addr.state_code || ' ' ||
addr.zip_code_num, ', ') || '-' || addr.zip_code_suffix, '-' ) AS line_3_addr
FROM pers, (SELECT pers_addr.pers_id ,
addr.line_1_addr, addr.line_2_addr,
addr.city_name, addr.state_code,
addr.zip_code_num, addr.zip_code_suffix
FROM pers_addr, addr WHERE addr.id = pers_addr.addr_id
AND TRUNC(SYSDATE) BETWEEN pers_addr.beg_date AND pers_addr.end_date
AND pers_addr.type_code = 'ML'
) addr WHERE pers.id = ? AND pers.id = addr.PERS_ID
You can make use select new the resource for your ordinary class. For example, I have the mapping classes Student, Team, School, Subject and want to get some data from these entities. And build an ordinary class StudentData
So I can do a query(#Query or #NamedQuery) like that:
select new com.my.package.vo.StudentData(student, teacher, subject) from Stutdent student inner join student.team inner join team.teacher ....

PostgreSql Group By and aggreate function error

My problem is, when I run the following query in MySQL, it looks like this
Query;
SELECT
CONCAT(b.tarih, '#', CONCAT(b.enlem, ',', b.boylam), '#', b.aldigi_yol) AS IlkMesaiEnlemBoylamImei,
CONCAT(tson.max_tarih, '#', CONCAT(tson.max_enlem, ',', tson.max_boylam), '#', tson.max_aldigi_yol) AS SonMesaiEnlemBoylamImei,
Max(CAST(b.hiz AS UNSIGNED)) As EnYuksekHiz,
TIME_FORMAT(Sec_TO_TIME(TIMESTAMPDIFF(SECOND, (b.tarih), (tson.max_tarih))), '%H:%i') AS DurmaSuresi
FROM
(Select id as max_id, tarih as max_tarih, enlem as max_enlem, boylam as max_boylam, aldigi_yol as max_aldigi_yol from _213gl2015016424 where id in(
SELECT MAX(id)
FROM _213gl2015016424 where (tarih between DATE('2016-11-30 05:45:00') AND Date('2017-01-13 14:19:06')) AND CAST(hiz AS UNSIGNED) > 0
GROUP BY DATE(tarih))
) tson
LEFT JOIN _213gl2015016424 a ON a.id = tson.max_id
LEFT JOIN _213gl2015016424 b ON DATE(b.tarih) = DATE(a.tarih)
WHERE b.tarih is not null And (b.tarih between DATE('2016-11-30 05:45:00') AND Date('2017-01-13 14:19:06')) AND b.hiz > 0
GROUP BY tson.max_tarih
Output is order by date;
Result query
When I try to run a query in PostgreSQL, I get group by mistake.
Query;
SELECT
CONCAT(b.tarih, '#', CONCAT(b.enlem, ',', b.boylam), '#', b.toplamyol) AS IlkMesaiEnlemBoylamImei,
CONCAT(tson.max_tarih, '#', CONCAT(tson.max_enlem, ',', tson.max_boylam), '#', tson.max_toplamyol) AS SonMesaiEnlemBoylamImei,
Max(CAST(b.hiz AS OID)) As EnYuksekHiz,
to_char(to_timestamp((extract(epoch from (tson.max_tarih)) - extract(epoch from (b.tarih)))) - interval '2 hour','HH24:MI') AS DurmaSuresi
FROM
(Select id as max_id, tarih as max_tarih, enlem as max_enlem, boylam as max_boylam, toplamyol as max_toplamyol from _213GL2016008691 where id in(
SELECT MAX(id)
FROM _213GL2016008691 where (tarih between DATE('2018-02-01 03:31:54') AND DATE('2018-03-01 03:31:54')) AND CAST(hiz AS OID) > 0
GROUP BY DATE(tarih))
) tson
LEFT JOIN _213GL2016008691 a ON a.id = tson.max_id
LEFT JOIN _213GL2016008691 b ON DATE(b.tarih) = DATE(a.tarih)
WHERE b.tarih is not null And (b.tarih between DATE('2018-02-12 03:31:54') AND DATE('2018-02-13 03:31:54')) AND b.hiz > 0
GROUP BY tson.max_tarih
Group by error is : To use the aggregate function, you must add the column "b.tarih" to the GROUP BY list.
When I add it I get the same error for another column.I'm waiting for your help.
You are using a feature of MySQL that is not standard SQL and you can also deactivate.
You are grouping by tson.max_tarih in your query. That means that for all rows that share the same value in that field, you will get only one row as a result of that group.
If you have several different values in the rest of the fields (enlem, boylam, etc...) which one are you trying to get in as the result of the query? That's the question that PostgreSQL is asking you.
MySQL is just returning any value for those fields among the rows in the group. PostgreSQL requires you to actually specify it.
Two typical solutions would be grouping by the rest of the fields (b.tarih, b.enlem) or specifying the value those fields to something like MAX(b.tarih), etc.

Additional conditions in JOIN

I have tables with articles and users, both have many-to-many mapping to third table - reads.
What I am trying to do here is to get all unread articles for particular user ( user_id not present in table reads ).
My query is getting all articles but those read are marked, which if fine as I can filter them out (user_id field contains id of user in question).
I have an SQL query like this:
SELECT articles.id, reads.user_id
FROM articles
LEFT JOIN
reads
ON articles.id = reads.article_id AND reads.user_id = 9
ORDER BY articles.last_update DESC LIMIT 5;
Which yields following:
articles.id | reads.user_id
-------------------+-----------------
57125839 | 9
57065456 |
56945065 |
56945066 |
56763090 |
(5 rows)
This is fine. This is what I want.
I'd like to get same result in Catalyst using my article model, but I cannot find any option to add conditions to a JOIN clause.
Do you know any way how to add AND X = Y to DBIx JOIN?
I know this can be done with custom resoult source and virtual view, but I have some other queries that could benefit from it and I'd like to avoid creating virtual view for each of them.
Thanks,
Canto
I don't even know what Catalyst is but I can hack the SQL query:
select articles.id, reads.user_id
from
articles
left join
(
select *
from reads
where user_id = 9
) reads on articles.id = reads.article_id
order by articles.last_update desc
limit 5;
I got an solution.
It's not straight forward, but it's better than virtual view.
http://search.cpan.org/dist/DBIx-Class/lib/DBIx/Class/Relationship/Base.pm#condition
Above describes how to use conditions in JOIN clause.
However, my case needs an variable in those conditions, which is not available by default in model.
So getting around a bit of model concept and introducing variable to it, we have the following.
In model file
our $USER_ID;
__PACKAGE__->has_many(
pindols => "My::MyDB::Result::Read",
sub {
my $args = shift;
die "no user_id specified!" unless $USER_ID;
return ({
"$args->{self_alias}.id" => { -ident => "$args->{foreign_alias}.article_id" },
"$args->{foreign_alias}.user_id" => { -ident => $USER_ID },
});
}
);
in controller
$My::MyDB::Result::Article::USER_ID = $c->user->id;
$articles = $channel->search(
{ "pindols.user_id" => undef } ,
{
page => int($page),
rows => 20,
order_by => 'last_update DESC',
prefetch => "pindols"
}
);
Will fetch all unread articles and yield following SQL.
SELECT me.id, me.url, me.title, me.content, me.last_update, me.author, me.thumbnail, pindols.article_id, pindols.user_id FROM (SELECT me.id, me.url, me.title, me.content, me.last_update, me.author, me.thumbnail FROM articles me LEFT JOIN reads pindols ON ( me.id = pindols.article_id AND pindols.user_id = 9 ) WHERE ( pindols.user_id IS NULL ) GROUP BY me.id, me.url, me.title, me.content, me.last_update, me.author, me.thumbnail ORDER BY last_update DESC LIMIT ?) me LEFT JOIN reads pindols ON ( me.id = pindols.article_id AND pindols.user_id = 9 ) WHERE ( pindols.user_id IS NULL ) ORDER BY last_update DESC: '20'
Of course you can skip the paging but I had it in my code so I included it here.
Special thanks goes to deg from #dbix-class on irc.perl.org and https://blog.afoolishmanifesto.com/posts/dbix-class-parameterized-relationships/.
Thanks,
Canto

PGSQL - Joining two tables on complicated condition

I got stuck during database migration on PostgreSQL and need your help.
I have two tables that I need to join: drzewa_mateczne.migracja (data I need to migrate) and ibl_as.t_adres_lesny (dictionary I need to join with migracja).
I need to join them on replace(drzewa_mateczne.migracja.adresy_lesne, ' ', '') = replace(ibl_as.t_adres_lesny.adres, ' ', ''). However my data is not very regular, so I want to join it on first good match with the dictionary.
I've created the following query:
select
count(*)
from
drzewa_mateczne.migracja a
where
length(a.adresy_lesne) > 0
and replace(a.adresy_lesne, ' ', '') = (select substr(replace(al.adres, ' ', ''), 1, length(replace(a.adresy_lesne, ' ', ''))) from ibl_as.t_adres_lesny al limit 1)
The query doesn't return any rows.
It does successfully join empty rows if ran without
length(a.adresy_lesne) > 0
The two following queries return rows (as expected):
select replace(adres, ' ', '')
from ibl_as.t_adres_lesny
where substr(replace(adres, ' ', ''), 1, 16) = '16-15-1-13-180-c'
limit 1
select replace(adresy_lesne, ' ', ''), length(replace(adresy_lesne, ' ', ''))
from drzewa_mateczne.migracja
where replace(adresy_lesne, ' ', '') = '16-15-1-13-180-c'
I'm suspecting that there might be a problem in sub-query inside the 'where' clause in my query. If you guys could help me resolve this issue, or at least point me in the right direction, I'd be very greatful.
Thanks in advance,
Jan
You can largely simplify to:
SELECT count(*)
FROM drzewa_mateczne.migracja a
WHERE a.adresy_lesne <> ''
AND EXISTS (
SELECT 1 FROM ibl_as.t_adres_lesny al
WHERE replace(al.adres, ' ', '')
LIKE (replace(a.adresy_lesne, ' ', '') || '%')
)
a.adresy_lesne <> '' does the same as length(a.adresy_lesne) > 0, just faster.
Replace the correlated subquery with an EXISTS semi-join (to get only one match per row).
Replace the complex string construction with a simple LIKE expression.
More information on pattern matching and index support in these related answers:
PostgreSQL LIKE query performance variations
Difference between LIKE and ~ in Postgres
speeding up wildcard text lookups
What you're basically telling the database to do is to get you the count of rows from drzewa_mateczne.migracja that have a non-empty adresy_lesne field that is a prefix of the adres field of a semi-random ibl_as.t_adres_lesny row...
Lose the "limit 1" in the subquery and substitute the "=" with "in" and see if that is what you wanted...

Need some SQL Stored proc help - joining

So I have this section of my proc:
SELECT
com_contact.rc_name_full as CreatedBy,
capComponent.cm_strike as CapStrike,
floorComponent.cm_strike as FloorStrike,
tq_nominal_notional as Notional,
maxComponent.cm_effective_dt as EffectiveDate,
maxComponent.cm_maturity_dt as MaturityDate,
CAST(CAST(DATEDIFF(mm,maxComponent.cm_effective_dt,maxComponent.cm_maturity_dt) as decimal(9,2))/12 as decimal(9,2)) as term,
(
CASE WHEN se_amort_term_mnth IS NOT NULL THEN se_amort_term_mnth / 12
ELSE CAST(CAST(DATEDIFF(mm,
ISNULL(cmam_amortization_start_dt, maxComponent.cm_effective_dt),
cmam_amortization_end_dt) as decimal(9,2))/12 as decimal(9,2))
END
) AS AmortTermYears,
tq_dd_product as Product,
dh_key_rate as KeyRate,
dh_pv01 as PV01,
dh_val_time_stamp as RateTimeStamp,
re_bnk_le.re_company_name as Company,
rc_contact_id as UserId,
stp_name as NickName,
'' as project,
'' as Borrower,
'' as Lender,
'' as AdditionalInfo,
CASE WHEN tpm_pd_permission_id = 85 THEN 'LLH' WHEN tpm_pd_permission_id = 86 THEN 'ALM' ELSE '' END as Permission,
tr_transaction_id as TransactionId,
NULL as IndicationId
FROM cfo_transaction
The line that says '' as project, we have to actually change to return data now.
The table that next to the FROM, called cfo_transaction has an id on it called tr_transaction_id. We have another table called com_project_transaction_link, that links those id's with project id's, using two two columns called:
pt_tr_transaction_id and pt_pj_project_id, and then we have a table containing all the projects called com_project that has a pj_project_id and a pj_project_name.
GOAL: return the pj_project_name from that projects table where it links with the transactions being pulled.
I really don't know how to do this.
Thanks!
Try this:
SELECT
com_contact.rc_name_full as CreatedBy,
capComponent.cm_strike as CapStrike,
floorComponent.cm_strike as FloorStrike,
tq_nominal_notional as Notional,
maxComponent.cm_effective_dt as EffectiveDate,
maxComponent.cm_maturity_dt as MaturityDate,
CAST(CAST(DATEDIFF(mm,maxComponent.cm_effective_dt,maxComponent.cm_maturity_dt) as decimal(9,2))/12 as decimal(9,2)) as term,
(
CASE WHEN se_amort_term_mnth IS NOT NULL THEN se_amort_term_mnth / 12
ELSE CAST(CAST(DATEDIFF(mm,
ISNULL(cmam_amortization_start_dt, maxComponent.cm_effective_dt),
cmam_amortization_end_dt) as decimal(9,2))/12 as decimal(9,2))
END
) AS AmortTermYears,
tq_dd_product as Product,
dh_key_rate as KeyRate,
dh_pv01 as PV01,
dh_val_time_stamp as RateTimeStamp,
re_bnk_le.re_company_name as Company,
rc_contact_id as UserId,
stp_name as NickName,
PR.pj_project_name as project,
'' as Borrower,
'' as Lender,
'' as AdditionalInfo,
CASE WHEN tpm_pd_permission_id = 85 THEN 'LLH' WHEN tpm_pd_permission_id = 86 THEN 'ALM' ELSE '' END as Permission,
tr_transaction_id as TransactionId,
NULL as IndicationId
FROM cfo_transaction TR
INNER JOIN com_project_transaction_link TL
ON TR.tr_transaction_id = TL.pt_tr_transaction_id
INNER JOIN com_project PR
ON TL.pt_pj_project_id = PR.pj_project_id
The query above assumes that every transaction and project is on the table that joins your tables of projects and transactions (thus the INNER JOIN), but yo can change those to LEFT JOIN if you want
You just add a second join to the other table to the query.
select
yourfields,
p.pj_project_name as project
FROM cfo_transaction t
join com_project_transaction_link tl
on t.tr_transaction_id = tl.pt_tr_transaction_id
join com_project p
on tl.pt_pj_project_id = p.pj_project_id
SELECT ..., cp.pj_project_name
FROM cfo_transaction ct
INNER JOIN com_project_transaction_link cptl
ON ct.tr_transaction_id = cptl.pt_tr_transaction_id
INNER JOIN com_project cp
ON cptl.pt_pj_project_id = cp.pj_project_id