total_found count mismatch when sorting (Sphinx v2.1) - sphinx

When I sort on a Unix-timestamp date field, total_found reports a different value depending on the sort direction. Here is my query:
SELECT * FROM CA_SAC_persons,CA_KC_persons,CA_SFC_persons,CA_SJ_persons
WHERE MATCH('#fullname("^John$" | "^Joseph$" | "^Jose$" | "^Josh$" | "^Robs$")')
ORDER BY filing_date_ts DESC LIMIT 0,1;SHOW META;
Result:
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| total         | 1000  |
| total_found   | 4813  |
| time          | 0.019 |
| docs[9]       | 4603  |
| hits[9]       | 5312  |
+---------------+-------+
SELECT * FROM CA_SAC_persons,CA_KC_persons,CA_SFC_persons,CA_SJ_persons
WHERE MATCH('#fullname("^John$" | "^Joseph$" | "^Jose$" | "^Josh$" | "^Robs$")')
ORDER BY filing_date_ts ASC LIMIT 0,1;SHOW META;
Result:
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| total         | 1000  |
| total_found   | 4812  |
| time          | 0.019 |
| docs[9]       | 4603  |
| hits[9]       | 5312  |
+---------------+-------+
Why does total_found show one record less in the second query?

Related

How to convert rows into columns in PostgreSQL for the table below

I am trying to convert the Trace table into the Result table in PostgreSQL. The table holds a huge amount of data.
I have a table named Trace:
entity_id                       | ts            | key         | bool_v | dbl_v | str_v   | long_v |
--------------------------------+---------------+-------------+--------+-------+---------+--------+
1ea815c48c5ac30bca403a1010b09f1 | 1593934026155 | temperature |        |       |         | 45     |
1ea815c48c5ac30bca403a1010b09f1 | 1593934026155 | operation   |        |       | Normal  |        |
1ea815c48c5ac30bca403a1010b09f1 | 1593934026155 | period      |        |       |         | 6968   |
1ea815c48c5ac30bca403a1010b09f1 | 1593933202984 | temperature |        |       |         | 44     |
1ea815c48c5ac30bca403a1010b09f1 | 1593933202984 | operation   |        |       | Reverse |        |
1ea815c48c5ac30bca403a1010b09f1 | 1593933202984 | period      |        |       |         | 3535   |
Convert the above table into the following Result table in PostgreSQL:
entity_id                       | ts            | temperature | operation | period |
--------------------------------+---------------+-------------+-----------+--------+
1ea815c48c5ac30bca403a1010b09f1 | 1593934026155 | 45          | Normal    | 6968   |
1ea815c48c5ac30bca403a1010b09f1 | 1593933202984 | 44          | Reverse   | 3535   |
Have you tried this yet?
select entity_id, ts,
max(long_v) filter (where key = 'temperature') as temperature,
max(str_v) filter (where key = 'operation') as operation,
max(long_v) filter (where key = 'period') as period
from trace
group by entity_id, ts;
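The pivot above can be exercised end to end. The sketch below uses Python's built-in sqlite3 with the question's rows, substituting the portable MAX(CASE ...) form for PostgreSQL's aggregate FILTER clause (it produces the same result for this data):

```python
import sqlite3

# Minimal sketch of the key/value -> column pivot, using an in-memory
# SQLite database loaded with the rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trace (entity_id TEXT, ts INTEGER, key TEXT, str_v TEXT, long_v INTEGER);
INSERT INTO trace VALUES
  ('1ea815c48c5ac30bca403a1010b09f1', 1593934026155, 'temperature', NULL, 45),
  ('1ea815c48c5ac30bca403a1010b09f1', 1593934026155, 'operation', 'Normal', NULL),
  ('1ea815c48c5ac30bca403a1010b09f1', 1593934026155, 'period', NULL, 6968),
  ('1ea815c48c5ac30bca403a1010b09f1', 1593933202984, 'temperature', NULL, 44),
  ('1ea815c48c5ac30bca403a1010b09f1', 1593933202984, 'operation', 'Reverse', NULL),
  ('1ea815c48c5ac30bca403a1010b09f1', 1593933202984, 'period', NULL, 3535);
""")

# MAX(CASE WHEN ...) is the portable equivalent of
# max(...) filter (where ...) in the answer above.
rows = conn.execute("""
SELECT entity_id, ts,
       MAX(CASE WHEN key = 'temperature' THEN long_v END) AS temperature,
       MAX(CASE WHEN key = 'operation'   THEN str_v  END) AS operation,
       MAX(CASE WHEN key = 'period'      THEN long_v END) AS period
FROM trace
GROUP BY entity_id, ts
ORDER BY ts DESC
""").fetchall()
for r in rows:
    print(r)
```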

Computing the median of all users in all trips

I am getting my hands dirty with a GPS trajectory dataset. It consists of sequences of GPS points, one sequence per trip, for a number of users:
SELECT * FROM gps_track;
+---------+------------------+------------------+
| user_id | lat              | lon              |
+---------+------------------+------------------+
| 1       | 39.984702        | 116.318417       |
| 1       | 39.984683        | 116.31845        |
| 1       | 39.984611        | 116.318026       |
| .       | .                | .                |
| 2       | 26.162202        | 119.943787       |
| 2       | 26.161528        | 119.943234       |
| 2       | 26.1619          | 119.943228       |
| .       | .                | .                |
| 3       | 22.8143366666667 | 108.332281666667 |
| 3       | 22.81429         | 108.332256666667 |
| 3       | 22.81432         | 108.332258333333 |
| .       | .                | .                |
| 4       | 32.9239666666667 | 117.386683333333 |
| 4       | 32.9235166666667 | 117.386616666667 |
| 4       | 32.9232833333333 | 117.386683333333 |
| .       | .                | .                |
+---------+------------------+------------------+
I can get the COUNT of GPS points for each user_id (1, 2, 3, etc.):
SELECT user_id
     , COUNT(lat) AS lat_count
FROM gps_track
GROUP BY user_id;
How do I then get the median of the number of GPS points across all the trips, not the median point for each user? Here's the fiddle with sample points from my dataset.
Maybe:
SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY lat_count)
FROM (SELECT user_id
, COUNT(lat) AS lat_count
FROM gps_track
GROUP BY user_id) du;
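As a sanity check on the "median of per-user counts" idea, the same computation can be sketched with Python's built-in sqlite3 (which lacks percentile_disc) by pulling the counts out and taking the discrete median in Python. The sample point counts below (3, 5, and 8 points for three users) are invented for illustration:

```python
import sqlite3
from statistics import median_low

# percentile_disc(0.5) is a *discrete* percentile: it returns an actual
# value from the set, which matches statistics.median_low.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gps_track (user_id INTEGER, lat REAL, lon REAL)")
points = [(1, 0.0, 0.0)] * 3 + [(2, 0.0, 0.0)] * 5 + [(3, 0.0, 0.0)] * 8
conn.executemany("INSERT INTO gps_track VALUES (?, ?, ?)", points)

# Inner query: one count per user; outer step: median of those counts.
counts = [n for (n,) in conn.execute(
    "SELECT COUNT(lat) FROM gps_track GROUP BY user_id")]
print(median_low(counts))  # counts are {3, 5, 8} -> 5
```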

ERROR: column "table.column_name" must appear in the GROUP BY clause or be used in an aggregate function

I have the following table:
SELECT * FROM trips_motion_xtics
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| session_id | trip_id | lat_start   | lat_end    | lon_start   | lon_end    | alt | distance         | segments_length  | speed | acceleration | track | travel_mode |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| 652        | 303633  | 41.1523521  | 41.1524966 | -8.6097233  | -8.6096833 | 0   | 42.7424443438547 | 28.0353622436523 | 0     | 74.208       | 0     | foot        |
| 652        | 303633  | 41.1523521  | 41.1524966 | -8.6097233  | -8.6096833 | 0   | 42.7424443438547 | 28.0353622436523 | 0     | 74.154       | 0     | foot        |
| 652        | 303633  | 41.1523521  | 41.1524966 | -8.6097233  | -8.6096833 | 0   | 42.7424443438547 | 28.0353622436523 | 0     | 68.226       | 0     | foot        |
| 656        | 303637  | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0   | 5921.07030809987 | 2785.6088546142  | 0     | 99.028       | 0     | car         |
| 656        | 303637  | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0   | 5921.07030809987 | 2785.6088546142  | 0     | 109.992      | 0     | car         |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
Now I would like to compute the average of the columns alt, distance, speed, ... for each unique combination of session_id, trip_id, lat_start, ...
Query:
SELECT DISTINCT(session_id, trip_id, lat_start, lat_end, lon_start, lon_end, travel_mode), AVG(alt) AS avg_alt, AVG(distance) AS avg_disntance, AVG(speed) AS avg_speed, AVG(acceleration) AS avg_acc FROM akil.trips_motion_xtics;
ERROR: column "trips_motion_xtics.session_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT DISTINCT(session_id, trip_id, lat_start, lat_end, lon...
Required result:
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| session_id | trip_id | lat_start   | lat_end    | lon_start   | lon_end    | alt | distance         | segments_length  | speed | acceleration | track | travel_mode |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| 652        | 303633  | 41.1523521  | 41.1524966 | -8.6097233  | -8.6096833 | 0   | 42.7424443438547 | 28.0353622436523 | 0     | 72.196       | 0     | foot        |
| 656        | 303637  | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0   | 5921.07030809987 | 2785.6088546142  | 0     | 104.51       | 0     | car         |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
You want aggregation. You get one row for each combination of the columns listed in the GROUP BY clause, and you can apply aggregate functions (such as AVG()) to the remaining columns:
SELECT
session_id,
trip_id,
lat_start,
lat_end,
lon_start,
lon_end,
travel_mode,
AVG(alt) AS avg_alt,
AVG(distance) AS avg_distance,
AVG(speed) AS avg_speed,
AVG(acceleration) AS avg_acc
FROM akil.trips_motion_xtics
GROUP BY
session_id,
trip_id,
lat_start,
lat_end,
lon_start,
lon_end,
travel_mode
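The aggregation can be verified with the rows from the question; the sketch below uses Python's built-in sqlite3, trimming the table to the grouping keys plus acceleration for brevity:

```python
import sqlite3

# Load the five sample rows and group exactly as in the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE trips_motion_xtics
    (session_id INTEGER, trip_id INTEGER, travel_mode TEXT, acceleration REAL)""")
conn.executemany("INSERT INTO trips_motion_xtics VALUES (?, ?, ?, ?)", [
    (652, 303633, 'foot', 74.208),
    (652, 303633, 'foot', 74.154),
    (652, 303633, 'foot', 68.226),
    (656, 303637, 'car', 99.028),
    (656, 303637, 'car', 109.992),
])
rows = conn.execute("""
SELECT session_id, trip_id, travel_mode, AVG(acceleration) AS avg_acc
FROM trips_motion_xtics
GROUP BY session_id, trip_id, travel_mode
ORDER BY session_id
""").fetchall()
for r in rows:
    print(r)  # one row per (session_id, trip_id, travel_mode) group
```

The averages match the required result: 72.196 for the foot group and 104.51 for the car group.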

Create Calculated Pivot from Several Query Results in PostgreSQL

I have a question about building a calculated pivot table from several query results in PostgreSQL. I have three query results but no idea how to combine and calculate them into a single table. Most of the questions I found while googling are about pivoting a single table, which I can already do with SUM, CASE, and GROUP BY. Here is a simplified version of my query results.
Result from query 1, which contains gross values:
| city  | code | gross  |
|-------|------|--------|
| city1 | 21   | 194793 |
| city1 | 25   | 139241 |
| city1 | 28   | 231365 |
| city2 | 21   | 282025 |
| city2 | 25   | 334458 |
| city2 | 28   | 410852 |
| city3 | 21   | 109237 |
Result from query 2, which contains positive adjustments:
| city  | code | adj_pos |
|-------|------|---------|
| city1 | 21   | 16259   |
| city1 | 25   | 13634   |
| city1 | 28   | 45854   |
| city2 | 25   | 18060   |
| city2 | 28   | 18220   |
Result from query 3, which contains negative adjustments:
| city  | code | adj_neg |
|-------|------|---------|
| city1 | 25   | 23364   |
| city2 | 21   | 27478   |
| city2 | 25   | 23474   |
What I want to do is create something like this:
| city  | 21_gross | 25_gross | 28_gross | 21_pos | 25_pos | 28_pos | 21_neg | 25_neg | 28_neg |
|-------|----------|----------|----------|--------|--------|--------|--------|--------|--------|
| city1 | 194793   | 139241   | 231365   | 16259  | 13634  | 45854  |        | 23364  |        |
| city2 | 282025   | 334458   | 410852   |        | 18060  | 18220  | 27478  | 23474  |        |
| city3 | 109237   |          |          |        |        |        |        |        |        |
or, even better, the final nett value (gross + positive adjustment - negative adjustment) for each city and code, like this:
| city  | 21_nett | 25_nett | 28_nett |
|-------|---------|---------|---------|
| city1 | 211052  | 129511  | 277219  |
| city2 | 254547  | 329044  | 429072  |
| city3 | 109237  | 0       | 0       |
Any suggestion will be appreciated. Thank you!
I think the best you can achieve is to get the pivoting part as JSON - http://sqlfiddle.com/#!17/b7d64/23:
select
city,
json_object_agg(
code,
coalesce(gross,0) + coalesce(adj_pos,0) - coalesce(adj_neg,0)
) as js
from q1
left join q2 using (city,code)
left join q3 using (city,code)
group by city
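The combine-and-net step (the joins plus COALESCE arithmetic) can be checked outside PostgreSQL. The sketch below uses Python's built-in sqlite3 with the three sample result sets, building a per-city dict in Python in place of json_object_agg, which is PostgreSQL-specific:

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE q1 (city TEXT, code INTEGER, gross INTEGER);
CREATE TABLE q2 (city TEXT, code INTEGER, adj_pos INTEGER);
CREATE TABLE q3 (city TEXT, code INTEGER, adj_neg INTEGER);
INSERT INTO q1 VALUES ('city1',21,194793),('city1',25,139241),('city1',28,231365),
                      ('city2',21,282025),('city2',25,334458),('city2',28,410852),
                      ('city3',21,109237);
INSERT INTO q2 VALUES ('city1',21,16259),('city1',25,13634),('city1',28,45854),
                      ('city2',25,18060),('city2',28,18220);
INSERT INTO q3 VALUES ('city1',25,23364),('city2',21,27478),('city2',25,23474);
""")

# Same LEFT JOINs and COALESCE arithmetic as the answer; the grouping
# into one object per city happens in Python instead of json_object_agg.
nett = defaultdict(dict)
for city, code, n in conn.execute("""
    SELECT q1.city, q1.code,
           COALESCE(gross,0) + COALESCE(adj_pos,0) - COALESCE(adj_neg,0)
    FROM q1
    LEFT JOIN q2 USING (city, code)
    LEFT JOIN q3 USING (city, code)
"""):
    nett[city][code] = n
print(dict(nett))
```

The result reproduces the nett table above: city1 code 25 is 139241 + 13634 - 23364 = 129511, and so on.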

Crosstab function and Dates PostgreSQL

I need to create a crosstab table from a query in which dates become column names. The set of order-date columns can grow or shrink depending on the date range passed to the query. The order date is stored as a Unix timestamp, which is converted to a regular date.
The query is the following:
Select cd.cust_id
     , od.order_id
     , od.order_size
     , (TIMESTAMP 'epoch' + od.order_date * INTERVAL '1 second')::Date As order_date
From consumer_details cd
   , consumer_order od
Where cd.cust_id = od.cust_id
And od.order_date Between 1469212200 And 1469212600
Order By od.order_id, od.order_date
Table as follows:
cust_id    | order_id       | order_size    | order_date
-----------|----------------|---------------|--------------
210721008  | 0437756        | 4323          | 2016-07-22
210721008  | 0437756        | 4586          | 2016-09-24
210721019  | 10749881       | 0             | 2016-07-28
210721019  | 10749881       | 0             | 2016-07-28
210721033  | 13639          | 2286145       | 2016-09-06
210721033  | 13639          | 2300040       | 2016-10-03
Result will be:
cust_id    | order_id       | 2016-07-22    | 2016-09-24    | 2016-07-28    | 2016-09-06    | 2016-10-03
-----------|----------------|---------------|---------------|---------------|---------------|---------------
210721008  | 0437756        | 4323          | 4586          |               |               |
210721019  | 10749881       |               |               | 0             |               |
210721033  | 13639          |               |               |               | 2286145       | 2300040
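Because crosstab() needs a fixed output column list, a column set that changes with the date range is often easier to build in application code. The sketch below uses Python's built-in sqlite3 with a subset of the rows above; the epoch values are hypothetical stand-ins for the dates shown, since the question does not give the per-row timestamps:

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (cust_id INTEGER, order_id TEXT, "
             "order_size INTEGER, order_date INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (210721008, '0437756', 4323, 1469145600),   # 2016-07-22 (assumed epoch)
    (210721008, '0437756', 4586, 1474675200),   # 2016-09-24 (assumed epoch)
    (210721033, '13639', 2286145, 1473120000),  # 2016-09-06 (assumed epoch)
])

# Pivot: one row per (cust_id, order_id), one column per distinct date.
# The column list is discovered from the data, so it adapts to the range.
pivot = {}
dates = []
for cust_id, order_id, size, ts in conn.execute(
        "SELECT * FROM orders ORDER BY order_id, order_date"):
    day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
    if day not in dates:
        dates.append(day)
    pivot.setdefault((cust_id, order_id), {})[day] = size
print(dates)
print(pivot)
```

Each dict in `pivot` holds that order's sizes keyed by date; rendering it as a grid with `dates` as the header row gives the crosstab shape shown above.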