Sphinx MVA for float values - sphinx

I have a table with has many tables:
company
id | name
---------
1 | a
2 | b
3 | c
company_address
id | company_id | latitude | longitude |
-------------------------------------------------------
1 | 1|0.9684982117632206|1.1395506188216191|
2 | 1|0.7874478257111129|0.6816976456543681|
3 | 2|0.9758854923552117| 0.744264348306201|
4 | 2|0.7860300249465478|0.6804121583003967|
5 | 2|0.7874478257111129|0.6816976456543681|
6 | 3|0.9684982117632206|1.1395506188216191|
sql_attr_multi do not support float type field, sql_joined_field was removed. How can I solve this problem? Maybe there are solutions besides GROUP_CONCAT()?

I think your easiest would be to arrange your sphinx 'documents' to be addresses, not strictly company. i.e., the unique document-id is the address table id. company_address.id
Keep the company-id as an attribute so can group IF really only want one result per company.
sql_query = SELECT a.*, c.name FROM company_address a INNER JOIN company c ON (c.id = company_id)
sql_attr_uint = company_id
sql_attr_float = latitude
sql_attr_float = longitude
Then GEODIST will work on pretty much directly...
SELECT *, GEODIST(0.659298124, -2.136602399, latitude, longitude) AS distance
FROM addressIndex WHERE distance < 10000 GROUP BY company_id;
Or if you want each company address, don't include the GROUP BY.
The company name is still included as a field for full-text searches.

Related

Transforming information in postgresql

So, I have 2 tables,
In the 1st table, there is an Information of users
user_id | name
1 | Albert
2 | Anthony
and in the other table, I have information
where some users have address information where it can either be home, office or both home and office
user_id| address_type | address
1 | home | a
1 | office | b
2 | home | c
and the final result I want is this
user_id | name | home_address | office_address
1 | Albert | a | b
2 | Anthony | c | null
I have tried using left join and json_agg but the information that way is not readable,
any suggestions on how I can do this?
You can use two outer joins, one for the office address and one for the home address.
select t1.user_id, t1.name,
ha.address as home_address,
oa.address as office_address
from table1 t1
left join table2 ha on ha.user_id = t1.user_id and ha.address_type = 'home'
left join table2 oa on oa.user_id = t1.user_id and ha.address_type = 'office';
A solution using JSON could look like this
select t1.user_id, t1.name,
a.addresses ->> 'home' as home_address,
a.addresses ->> 'office' as office_address
from table1 t1
left join (
select user_id, jsonb_object_agg(address_type, address) as addresses
from table2
group by user_id
) a on a.user_id = t1.user_id;
Which might be a bit more flexible, because you don't need to add a new join for each address type. The first query is likely to be faster if you need to retrieve a large number of rows.

PostgreSQL: Selecting one address from almost but not exactly duplicate rows

I have a big table that I'm trying to join another table to, however the table has entries such as:
--- Name | Address | Priority
----------------------------------------
1 | Jane Doe | 123 Baker St | 1
2 | Jane Doe | 345 Clay Dr | 2
3 | Jeff Boe | 231 Street St| 1
4 | Karen Al | 4232 Elm St | 1
5 | Karen Al | 5632 Pine Ct | 2
What I really want to select is one single address per person. The correct address I want is priority 2. However some of the addresses don't have a priority 2, so I can't join only on priority 2.
I've tried the following test query:
SELECT DISTINCT n.ID, LastName, FirstName, MAX(Address), MAX(Address2), City, State, PostalCode, n.Phone
FROM NormalTable n
JOIN Contracts cn ON n.ID = cn.ID
Which returns the table that I sketched out above, with the same person/sameID but different addresses.
Is there a way to do this in one query? I can think of maybe doing one INSERT statement into my final table where I do all the priority 2 addresses and then ANOTHER INSERT statement for IDs that aren't in the table yet, and use the priority 1 address for those. But I'd much prefer if there's a way to do this all in one go where I end up with only the address I want.
You could choice the address you need joining a subquery for max priority
select m.LastName, m.FirstName, m.Address, m.Address2, m.City, m.State, m.PostalCode, m.Phone
from my_table m
inner join (
select LastName, FirstName, max(priority) max_priority
from my_table
group by LastName, FirstName
) t on t.LastName = m.LastName
AND t.FirstName = m.FirstName
AND t.max_priority = m.priority
I think you want something like this
SELECT DISTINCT (Name), Address, Priority
ORDER BY Priority DESC
How this works is that the DISTINCT (Name) only returns one row per name. The row returned for each Name is the first row. Which will be the one with the highest priority because of the ORDER BY.

Unpack expression results from case statement

Four categories in category table.
id | name
--------------
1 | 'wine'
2 | 'chocolate'
3 | 'autos'
4 | 'real estate'
Two of the many (thousands of) forecasters in forecaster table.
id | name
--------------
1 | 'sothebys'
2 | 'cramer'
Relevant forecasts by the forecasters for the categories in the forecast table.
| id | forecaster_id | category_id | forecast |
|----+---------------+-------------+--------------------------------------------------------------|
| 1 | 1 | 1 | 'bad weather, prices rise short-term' |
| 2 | 1 | 2 | 'cocoa bean surplus, prices drop' |
| 3 | 1 | 3 | 'we dont deal with autos - no idea' |
| 4 | 2 | 2 | 'sell, sell, sell' |
| 5 | 2 | 3 | 'demand for cocoa will skyrocket - prices up - buy, buy buy' |
I want prioritized mapping of (forecaster, category, forecast) such that, if a forecast exists for some primary forecaster (e.g. 'cramer') use it because I trust him more. If a forecast exists for some secondary forecaster (e.g. 'sothebys') use that. If no forecast exists for a category, return a row with that category and null for forecast.
I have something that almost works and after I get the logic down I hope to turn into parameterized query.
select
case when F1.category is not null
then (F1.forecaster, F1.category, F1.forecast)
when F2.category is not null
then (F2.forecaster, F2.category, F2.forecast)
else (null, C.category, null)
end
from
(
select
FR.name as forecaster,
C.id as cid,
C.category as category,
F.forecast
from
forecast F
inner join forecaster FR on (F.forecaster_id = FR.id)
inner join category C on (C.id = F.category_id)
where FR.name = 'cramer'
) F1
right join (
select
FR.name as forecaster,
C.id as cid,
C.category as category,
F.forecast
from
forecast F
inner join forecaster FR on (F.forecaster_id = FR.id)
inner join category C on (C.id = F.category_id)
where FR.name = 'sothebys'
) F2 on (F1.cid = F2.cid)
full outer join category C on (C.id = F2.cid);
This gives:
'(sothebys,wine,"bad weather, prices rise short-term")'
'(cramer,chocolate,"sell, sell, sell")'
'(cramer,autos,"demand for cocoa will skyrocket - prices up - buy, buy buy")'
'(,"real estate",)'
While that is the desired data it is a record of one column instead of three. The case was the only way I could find to achieve the ordering of cramer first sothebys next and there is lots of duplication. Is there a better way and how can the tuple like results be pulled back apart into columns?
Any suggestions, especially related to removal of duplication or general simplification appreciated.
This sounds like a case for DISTINCT ON (untested):
SELECT DISTINCT ON (c.id)
fr.name AS forecaster,
c.name AS category,
f.forecast
FROM forecast f
JOIN forecaster fr ON f.forecaster_id = fr.id
RIGHT JOIN category c ON f.category_id = c.id
ORDER BY
c.id,
CASE WHEN fr.name = 'cramer' THEN 0
WHEN fr.name = 'sothebys' THEN 1
ELSE 2
END;
For each category, the first row in the ordering will be picked. Since Cramer has a higher id than Sotheby's, it will be given preference.
Adapt the ORDER BY clause if you need a more complicated ranking.

How to use COUNT() in more that one column?

Let's say I have this 3 tables
Countries ProvOrStates MajorCities
-----+------------- -----+----------- -----+-------------
Id | CountryName Id | CId | Name Id | POSId | Name
-----+------------- -----+----------- -----+-------------
1 | USA 1 | 1 | NY 1 | 1 | NYC
How do you get something like
---------------------------------------------
CountryName | ProvinceOrState | MajorCities
| (Count) | (Count)
---------------------------------------------
USA | 50 | 200
---------------------------------------------
Canada | 10 | 57
So far, the way I see it:
Run the first SELECT COUNT (GROUP BY Countries.Id) on Countries JOIN ProvOrStates,
store the result in a table variable,
Run the second SELECT COUNT (GROUP BY Countries.Id) on ProvOrStates JOIN MajorCities,
Update the table variable based on the Countries.Id
Join the table variable with Countries table ON Countries.Id = Id of the table variable.
Is there a possibility to run just one query instead of multiple intermediary queries? I don't know if it's even feasible as I've tried with no luck.
Thanks for helping
Use sub query or derived tables and views
Basically If You You Have 3 Tables
select * from [TableOne] as T1
join
(
select T2.Column, T3.Column
from [TableTwo] as T2
join [TableThree] as T3
on T2.CondtionColumn = T3.CondtionColumn
) AS DerivedTable
on T1.DepName = DerivedTable.DepName
And when you are 100% percent sure it's working you can create a view that contains your three tables join and call it when ever you want
PS: in case of any identical column names or when you get this message
"The column 'ColumnName' was specified multiple times for 'Table'. "
You can use alias to solve this problem
This answer comes from #lotzInSpace.
SELECT ct.[CountryName], COUNT(DISTINCT p.[Id]), COUNT(DISTINCT c.[Id])
FROM dbo.[Countries] ct
LEFT JOIN dbo.[Provinces] p
ON ct.[Id] = p.[CountryId]
LEFT JOIN dbo.[Cities] c
ON p.[Id] = c.[ProvinceId]
GROUP BY ct.[CountryName]
It's working. I'm using LEFT JOIN instead of INNER JOIN because, if a country doesn't have provinces, or a province doesn't have cities, then that country or province doesn't display.
Thanks again #lotzInSpace.

PostgreSQL - Count Function

I've got two Tables Called Manu and Cars
Manufacturer | Employees | id
Toyota | 102346 | 1
Subaru | 284608 | 2
Kia | 268244 | 3
Suzuki | 228624 | 4
The second table Cars
Car | id
Corolla | 1
camry | 1
alto | 4
vitara | 4
forester | 2
impreza | 2
xv | 2
cerato | 3
celica | 1
Now the table Cars references back to table Manu through Id
Im trying to return manufacturers that have produced 2 or more models of cars.
So far what I have tried is
Select m.id, m.manufacturer
from Manu m
inner join Cars n on m.id = n.id
group by m.id having count(n.id) >= 2;
it tells me that column m.id must appear in group by clause or be used in an aggregate function. Very confused.
Thanks
First we'll get the number of cars by manufacturer:
SELECT id,COUNT(id) FROM cars GROUP BY id;
Next, we'll use that data to get the manufacturers that make more than two models:
SELECT s.id,s.count,m.manufacturer FROM
(SELECT id,COUNT(id) FROM cars GROUP BY id) s
JOIN manu m USING (id) WHERE s.count >= 2;
Actually you have already give the answer. just you didnt group by m.manufactorer;
Select m.id, m.manufactorer
from Manu m
inner join Cars n using(id)
group by m.id,m.manufactorer
having count(*) >= 2;