ERROR: column "table.column_name" must appear in the GROUP BY clause or be used in an aggregate function - postgresql

I have the following table:
SELECT * FROM trips_motion_xtics
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| session_id | trip_id | lat_start | lat_end | lon_start | lon_end | alt | distance | segments_length | speed | acceleration | track | travel_mode |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| 652 | 303633 | 41.1523521 | 41.1524966 | -8.6097233 | -8.6096833 | 0 | 42.7424443438547 | 28.0353622436523 | 0 | 74.208 | 0 | foot |
| 652 | 303633 | 41.1523521 | 41.1524966 | -8.6097233 | -8.6096833 | 0 | 42.7424443438547 | 28.0353622436523 | 0 | 74.154 | 0 | foot |
| 652 | 303633 | 41.1523521 | 41.1524966 | -8.6097233 | -8.6096833 | 0 | 42.7424443438547 | 28.0353622436523 | 0 | 68.226 | 0 | foot |
| 656 | 303637 | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0 | 5921.07030809987 | 2785.6088546142 | 0 | 99.028 | 0 | car |
| 656 | 303637 | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0 | 5921.07030809987 | 2785.6088546142 | 0 | 109.992 | 0 | car |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
Now I would like to compute the average of the columns alt, distance, speed, ... for each unique combination of session_id, trip_id, lat_start, ...
Query:
SELECT DISTINCT(session_id, trip_id, lat_start, lat_end, lon_start, lon_end, travel_mode), AVG(alt) AS avg_alt, AVG(distance) AS avg_distance, AVG(speed) AS avg_speed, AVG(acceleration) AS avg_acc FROM akil.trips_motion_xtics;
ERROR: column "trips_motion_xtics.session_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT DISTINCT(session_id, trip_id, lat_start, lat_end, lon...
Required result:
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| session_id | trip_id | lat_start | lat_end | lon_start | lon_end | alt | distance | segments_length | speed | acceleration | track | travel_mode |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+
| 652 | 303633 | 41.1523521 | 41.1524966 | -8.6097233 | -8.6096833 | 0 | 42.7424443438547 | 28.0353622436523 | 0 | 72.196 | 0 | foot |
| 656 | 303637 | 41.14454009 | 41.1631127 | -8.56292593 | -8.5870161 | 0 | 5921.07030809987 | 2785.6088546142 | 0 | 104.51 | 0 | car |
+------------+---------+-------------+------------+-------------+------------+-----+------------------+------------------+-------+--------------+-------+-------------+

You want aggregation, not DISTINCT. List the columns that define a group in the GROUP BY clause: you get one row for each distinct combination of those columns, and you can apply aggregate functions (such as AVG()) to the other columns:
SELECT
    session_id,
    trip_id,
    lat_start,
    lat_end,
    lon_start,
    lon_end,
    travel_mode,
    AVG(alt) AS avg_alt,
    AVG(distance) AS avg_distance,
    AVG(speed) AS avg_speed,
    AVG(acceleration) AS avg_acc
FROM akil.trips_motion_xtics
GROUP BY
    session_id,
    trip_id,
    lat_start,
    lat_end,
    lon_start,
    lon_end,
    travel_mode;
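To check the GROUP BY approach against the sample rows from the question, here is a minimal sketch that runs the same kind of query in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for PostgreSQL here as an assumption (it is more lenient and would not raise the original error), the akil schema is dropped, and only a subset of the columns is loaded to keep the example short.

```python
import sqlite3

# Sample rows copied from the question (subset of columns:
# session_id, trip_id, travel_mode, distance, acceleration).
rows = [
    (652, 303633, "foot", 42.7424443438547, 74.208),
    (652, 303633, "foot", 42.7424443438547, 74.154),
    (652, 303633, "foot", 42.7424443438547, 68.226),
    (656, 303637, "car", 5921.07030809987, 99.028),
    (656, 303637, "car", 5921.07030809987, 109.992),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips_motion_xtics ("
    "session_id INTEGER, trip_id INTEGER, travel_mode TEXT, "
    "distance REAL, acceleration REAL)"
)
conn.executemany("INSERT INTO trips_motion_xtics VALUES (?, ?, ?, ?, ?)", rows)

# One output row per distinct (session_id, trip_id, travel_mode) combination;
# every other selected column goes through an aggregate function.
query = """
    SELECT session_id, trip_id, travel_mode,
           AVG(distance) AS avg_distance,
           AVG(acceleration) AS avg_acc
    FROM trips_motion_xtics
    GROUP BY session_id, trip_id, travel_mode
    ORDER BY session_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The five input rows collapse into two groups, and the averaged acceleration values agree with the required result in the question (72.196 and 104.51, up to floating-point rounding).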

Related

tableau calculate cumulative value with condition

I have a tableau table with columns like this:
| ID | ww | count_flag |
| 1 | ww1 | 0 |
| 1 | ww2 | 1 |
| 1 | ww3 | 1 |
| 1 | ww4 | 0 |
| 1 | ww5 | 1 |
| 2 | ww1 | 1 |
| 2 | ww2 | 1 |
| 2 | ww3 | 1 |
| 2 | ww4 | 0 |
| 2 | ww5 | 1 |
...
Now I'd like to add a new column that shows the consistent status for each ID across all the ww (workweek) values. The consistent status is reset whenever count_flag is 0 or the ID changes, so it would look like this:
|ID | ww | count_flag | consistent status|
| 1 | ww1 | 0 | 0 |
| 1 | ww2 | 1 | 1 |
| 1 | ww3 | 1 | 2 |
| 1 | ww4 | 0 | 0 |
| 1 | ww5 | 1 | 1 |
| 2 | ww1 | 1 | 1 |
| 2 | ww2 | 1 | 2 |
| 2 | ww3 | 1 | 3 |
| 2 | ww4 | 0 | 0 |
| 2 | ww5 | 1 | 1 |
...
How should I create a calculated field to add such a column to the table?
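The reset rule described in this question can be pinned down with a small sketch in plain Python. This is not a Tableau calculated field, only an illustration of the intended logic under the assumption that the rows are already sorted by ID and then by ww; the function name consistent_status is made up for the example.

```python
# Rows copied from the question: (ID, ww, count_flag), sorted by ID then ww.
rows = [
    (1, "ww1", 0), (1, "ww2", 1), (1, "ww3", 1), (1, "ww4", 0), (1, "ww5", 1),
    (2, "ww1", 1), (2, "ww2", 1), (2, "ww3", 1), (2, "ww4", 0), (2, "ww5", 1),
]

def consistent_status(rows):
    """Append a running count of consecutive count_flag == 1 rows per ID."""
    result = []
    prev_id = None
    streak = 0
    for row_id, ww, flag in rows:
        # Reset when a new ID starts or when the flag is 0.
        if row_id != prev_id or flag == 0:
            streak = 0
        if flag == 1:
            streak += 1
        result.append((row_id, ww, flag, streak))
        prev_id = row_id
    return result

for row in consistent_status(rows):
    print(row)
```

On the sample data this produces the status column 0, 1, 2, 0, 1 for ID 1 and 1, 2, 3, 0, 1 for ID 2, which is the table shown in the question.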

Is there any way to group by continuous values

I have data from a car GPS tracker and I need to find parking and movement periods.
Currently every row in the table has the id of a parking or a movement:
+------------+-------+----------+---------+------------+
| time | speed | ignition | trip_id | parking_id |
+------------+-------+----------+---------+------------+
| 05:48:50 | 0 | false | 0 | 300 |
| 05:49:20 | 0 | false | 0 | 300 |
| 05:49:20 | 10 | true | 300 | 0 |
| 05:50:01 | 20 | true | 300 | 0 |
| 05:51:20 | 17 | true | 300 | 0 |
| 05:51:20 | 0 | false | 0 | 301 |
| 05:52:40 | 0 | false | 0 | 301 |
| 05:52:40 | 12 | true | 301 | 0 |
| 05:52:50 | 22 | true | 301 | 0 |
| 05:53:00 | 30 | true | 301 | 0 |
| 05:53:30 | 40 | true | 301 | 0 |
| 05:53:30 | 0 | false | 0 | 302 |
| 05:55:00 | 0 | false | 0 | 302 |
+------------+-------+----------+---------+------------+
SELECT min(time) as time_start, max(time) as time_end, trip_id, parking_id
FROM 'tablename' GROUP BY trip_id, parking_id
The result is:
+------------+----------+---------+------------+
| time start | time end | trip id | parking id |
+------------+----------+---------+------------+
| 05:48:50 | 05:49:20 | 0 | 300|
| 05:49:20 | 05:51:20 | 300| 0 |
| 05:51:20 | 05:52:40 | 0 | 301|
| 05:52:40 | 05:53:30 | 301| 0 |
| 05:53:30 | 05:55:00 | 0 | 302|
+------------+----------+---------+------------+
How can I group the rows without using parking_id and trip_id, and instead generate the parking ids and trip ids? The final result must be:
+------------+----------+---------+------------+
| time start | time end | trip id | parking id |
+------------+----------+---------+------------+
| 05:48:50 | 05:49:20 | 0 | 1 |
| 05:49:20 | 05:51:20 | 1 | 0 |
| 05:51:20 | 05:52:40 | 0 | 2 |
| 05:52:40 | 05:53:30 | 2 | 0 |
| 05:53:30 | 05:55:00 | 0 | 3 |
+------------+----------+---------+------------+
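To make the desired grouping concrete, here is a minimal sketch in plain Python rather than SQL (a deliberate swap of technique). It groups consecutive rows by the ignition flag with itertools.groupby and numbers the parking and trip runs separately. It assumes the rows arrive in the order shown in the question; the function name periods is made up for the example.

```python
from itertools import groupby

# Rows copied from the question, in table order: (time, speed, ignition).
rows = [
    ("05:48:50", 0, False),
    ("05:49:20", 0, False),
    ("05:49:20", 10, True),
    ("05:50:01", 20, True),
    ("05:51:20", 17, True),
    ("05:51:20", 0, False),
    ("05:52:40", 0, False),
    ("05:52:40", 12, True),
    ("05:52:50", 22, True),
    ("05:53:00", 30, True),
    ("05:53:30", 40, True),
    ("05:53:30", 0, False),
    ("05:55:00", 0, False),
]

def periods(rows):
    """Return (time_start, time_end, trip_id, parking_id) per consecutive run."""
    result = []
    trip_no = 0
    parking_no = 0
    # groupby only merges adjacent rows, so each run of equal ignition
    # values becomes one period.
    for ignition, group in groupby(rows, key=lambda r: r[2]):
        times = [r[0] for r in group]
        if ignition:
            trip_no += 1
            result.append((min(times), max(times), trip_no, 0))
        else:
            parking_no += 1
            result.append((min(times), max(times), 0, parking_no))
    return result

for period in periods(rows):
    print(period)
```

Because the timestamps are zero-padded strings, min() and max() order them correctly. On the sample rows the output is the five periods from the required result, with parking ids 1 to 3 and trip ids 1 to 2.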

PostgreSQL aggregate function for each row across multiple unknown number of columns

I looked through similar questions like this one, but they seem to have a fixed number of columns. I would like to handle an input table whose number of columns I do not know in advance.
Question:
How can I calculate aggregate functions (e.g. avg() or sum()) for each row across several columns if the number of columns is not known in advance?
I have put the input table panel_stats_rnd as a CSV and the DDL to create it here.
For each row, I would like to calculate rnd_avg_parcelcount as the average of all columns c_1_avg_parcelcount, c_2_avg_parcelcount, ..., where the input table can have any number (say 100) of _avg_parcelcount columns. For the column rnd_sum_parcelcount, I would like to calculate the sum() of all columns that start with c_ and end with _sum_parcelcount.
The table looks like this:
SELECT * FROM panel_stats_rnd;
gid | d | dist_from | dist_to | distlabel | rnd_avg_parcelcount | rnd_sum_parcelcount | rnd_avg_callcount | rnd_sum_callcount | rnd_avg_perccalled | called_avg_parcelcount | called_sum_parcelcount | called_avg_callcount | called_sum_callcount | called_avg_perccalled | c_1_avg_parcelcount | c_1_sum_parcelcount | c_1_avg_callcount | c_1_sum_callcount | c_1_avg_perccalled | c_2_avg_parcelcount | c_2_sum_parcelcount | c_2_avg_callcount | c_2_sum_callcount | c_2_avg_perccalled
-----+----+-----------+---------+-----------+---------------------+---------------------+-------------------+-------------------+--------------------+------------------------+------------------------+----------------------+----------------------+-----------------------+---------------------+---------------------+-------------------+-------------------+----------------------+---------------------+---------------------+-------------------+-------------------+----------------------
1 | 0 | 0 | 100 | 0-100 | | | | | | 119045 | 119045 | 119045 | 23 | 0.000193204250493511 | 119045 | 119045 | 119045 | 16 | 0.000134402956865051 | 119045 | 119045 | 119045 | 16 | 0.000134402956865051
2 | 1 | 100 | 200 | 100-200 | | | | | | 163140 | 163140 | 163140 | 22 | 0.000134853500061297 | 163140 | 163140 | 163140 | 17 | 0.000104204977320093 | 163140 | 163140 | 163140 | 18 | 0.000110334681868334
3 | 2 | 200 | 300 | 200-300 | | | | | | 135934 | 135934 | 135934 | 10 | 7.3565112481057e-05 | 135934 | 135934 | 135934 | 18 | 0.000132417202465903 | 135934 | 135934 | 135934 | 15 | 0.000110347668721585
4 | 3 | 300 | 400 | 300-400 | | | | | | 116874 | 116874 | 116874 | 13 | 0.000111230898232284 | 116874 | 116874 | 116874 | 11 | 9.41184523503944e-05 | 116874 | 116874 | 116874 | 18 | 0.000154012012937009
5 | 4 | 400 | 500 | 400-500 | | | | | | 93216 | 93216 | 93216 | 12 | 0.000128733264675592 | 93216 | 93216 | 93216 | 10 | 0.000107277720562993 | 93216 | 93216 | 93216 | 12 | 0.000128733264675592
6 | 5 | 500 | 600 | 500-600 | | | | | | 69992 | 69992 | 69992 | 7 | 0.0001000114298777 | 69992 | 69992 | 69992 | 10 | 0.000142873471253858 | 69992 | 69992 | 69992 | 7 | 0.0001000114298777
7 | 6 | 600 | 700 | 600-700 | | | | | | 50816 | 50816 | 50816 | 10 | 0.000196788413098237 | 50816 | 50816 | 50816 | 6 | 0.000118073047858942 | 50816 | 50816 | 50816 | 0 | 0
8 | 7 | 700 | 800 | 700-800 | | | | | | 34814 | 34814 | 34814 | 0 | 0 | 34814 | 34814 | 34814 | 6 | 0.000172344459125639 | 34814 | 34814 | 34814 | 4 | 0.000114896306083759
9 | 8 | 800 | 900 | 800-900 | | | | | | 23023 | 23023 | 23023 | 1 | 4.34348260435217e-05 | 23023 | 23023 | 23023 | 4 | 0.000173739304174087 | 23023 | 23023 | 23023 | 1 | 4.34348260435217e-05
10 | 9 | 900 | 1000 | 900-1000 | | | | | | 14215 | 14215 | 14215 | 1 | 7.03482237073514e-05 | 14215 | 14215 | 14215 | 1 | 7.03482237073514e-05 | 14215 | 14215 | 14215 | 5 | 0.000351741118536757
11 | 10 | 1000 | 5000 | 1000-5000 | | | | | | 23527 | 23527 | 23527 | 0 | 0 | 23527 | 23527 | 23527 | 0 | 0 | 23527 | 23527 | 23527 | 3 | 0.000127513070089684
(11 rows)
I tried the following for 2 columns (it works, but I'd rather not write it 5 times for 100 columns; besides, the number of columns has to be a parameter):
SELECT d,c_1_avg_parcelcount,c_2_avg_parcelcount,
(SELECT avg(c) FROM (VALUES (c_1_avg_parcelcount) , (c_2_avg_parcelcount) ) T (c)) AS Avg_,
(SELECT sum(c) FROM (VALUES (c_1_avg_parcelcount) , (c_2_avg_parcelcount) ) T (c)) AS sum_
FROM panel_stats_rnd;
I also tried the following, but it doesn't work:
WITH cols AS (
select value(column_name) from information_schema.columns
where table_name = 'panel_stats_rnd'
AND column_name SIMILAR TO 'c_%avg_parcelcount'
AND column_name != 'called_avg_parcelcount'
)
SELECT *, (SELECT avg(Col) FROM cols V(Col) ) AS col_average
FROM panel_stats_rnd;
I am almost there but something is missing...
select
    *,
    (select avg(v::numeric)
     from json_each_text(row_to_json(panel_stats_rnd.*)) as j(k,v)
     where k like 'c\_%\_avg\_parcelcount') as rnd_avg_parcelcount,
    (select sum(v::numeric)
     from json_each_text(row_to_json(panel_stats_rnd.*)) as j(k,v)
     where k like 'c\_%\_sum\_parcelcount') as rnd_sum_parcelcount
from
    panel_stats_rnd;
See the documentation for the functions involved.
The underscores are escaped (\_) because in a LIKE pattern an unescaped underscore matches any single character; for example, select 'a' like '_'; is true.
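The effect of escaping the underscore can be seen in a small sketch that uses Python's sqlite3 module as a stand-in for PostgreSQL. One difference is an assumption to keep in mind: SQLite has no default escape character, so the sketch spells out ESCAPE '\' explicitly, whereas PostgreSQL treats the backslash as the default LIKE escape. The helper name matching is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names taken from the table in the question.
names = ["c_1_avg_parcelcount", "c_2_avg_parcelcount", "called_avg_parcelcount"]

def matching(pattern):
    """Return the names that match the LIKE pattern, with backslash as escape."""
    # The SQL text is: SELECT ? LIKE ? ESCAPE '\'
    sql = "SELECT ? LIKE ? ESCAPE '\\'"
    return [n for n in names if conn.execute(sql, (n, pattern)).fetchone()[0]]

# Unescaped: each _ matches any single character, so "called_..." slips through.
unescaped = matching("c_%_avg_parcelcount")

# Escaped: each \_ matches only a literal underscore.
escaped = matching(r"c\_%\_avg\_parcelcount")

# The one-character example from the answer: 'a' LIKE '_' is true (1 in SQLite).
single_char = conn.execute("SELECT 'a' LIKE '_'").fetchone()[0]

print(unescaped)
print(escaped)
print(single_char)
```

With the unescaped pattern, called_avg_parcelcount matches as well, which is the column the second attempt in the question had to exclude by hand; with the escaped pattern only the c_1 and c_2 columns remain.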

Emacs Orgmode table $> References does not work

GNU Emacs 24.4.1 org-mode
Here is an org-mode table
#+TBLNAME: revenue
| / | < | | < | | < | | | | | | | | | | | |
| Product | Year_SUM | Month_SUM | Platform | Platform_SUM | adwo | AdMob | adChina | adSage | appfigures | appdriver | coco | Domob | Dianru | Limei | guohead | youmi |
| | | | | | | | | | | | | | | | | |
|---------+----------+-----------+----------+------------------+------+-------+---------+--------+------------+-----------+------+-------+--------+-------+---------+-------|
| Jan | | | iOS | #ERROR | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| | | | Android | =vsum($6..$>);NE | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 |
|---------+----------+-----------+----------+------------------+------+-------+---------+--------+------------+-----------+------+-------+--------+-------+---------+-------|
| | | | | | | | | | | | | | | | | |
#+TBLFM: $5=vsum($6..$>);NE
As you can see, the formula $5=vsum($6..$>);NE cannot be calculated. Here is the debug info:
Substitution history of formula
Orig: vsum($6..$>)
$xyz-> vsum($6..$>)
#r$c-> vsum($6..$>)
$1-> vsum((0)..$>)
--------^
Error: Expected `)'
But if I replace the formula with $5=vsum($6..$17), it works. I can't figure out where the problem is.
I need some help and would appreciate it!

Total found count mismatch while sorting in Sphinx V2.1

When I sort on a Unix timestamp date field, total_found shows a different result. Here is my query:
SELECT * FROM CA_SAC_persons,CA_KC_persons,CA_SFC_persons,CA_SJ_persons
WHERE MATCH('#fullname("^John$" | "^Joseph$" | "^Jose$" | "^Josh$" | "^Robs$")')
ORDER BY filing_date_ts DESC LIMIT 0,1;SHOW META;
Result :
+---------------+-------------+
| Variable_name | Value |
+---------------+-------------+
| total | 1000 |
| total_found | 4813 |
| time | 0.019 |
| docs[9] | 4603 |
| hits[9] | 5312 |
+---------------+-------------+
SELECT * FROM CA_SAC_persons,CA_KC_persons,CA_SFC_persons,CA_SJ_persons
WHERE MATCH('#fullname("^John$" | "^Joseph$" | "^Jose$" | "^Josh$" | "^Robs$")')
ORDER BY filing_date_ts ASC LIMIT 0,1;SHOW META;
Result :
+---------------+-------------+
| Variable_name | Value |
+---------------+-------------+
| total | 1000 |
| total_found | 4812 |
| time | 0.019 |
| docs[9] | 4603 |
| hits[9] | 5312 |
+---------------+-------------+
Why does total_found show one record fewer in the second query?