I have an index where some data's has duplicate, all fields are similar except for latitude,longitude and id (field id is not realy ID, just generated row_number() OVER () AS id).
it's example:
mysql> select id,vacancy_id,prof_area_ids,latitude,longitude from jobVacancy;
+------+------------+---------------+----------+-----------+
| id | vacancy_id | prof_area_ids | latitude | longitude |
+------+------------+---------------+----------+-----------+
| 1 | 917 | 11,199,202 | 0.973178 | 0.743566 |
| 2 | 916 | 17,283,288 | 0.973178 | 0.743566 |
| 3 | 915 | 17,288 | 0.973178 | 0.743566 |
| 4 | 914 | 30,482 | 0.973178 | 0.743566 |
| 5 | 919 | 15,243 | 0.825153 | 0.692837 |
| 6 | 919 | 15,243 | 0.825162 | 0.692828 |
| 7 | 918 | 8,154 | 0.825153 | 0.692837 |
| 8 | 918 | 8,154 | 0.825162 | 0.692828 |
| 9 | 920 | 17,283,288 | 0.958914 | 1.282161 |
| 10 | 920 | 17,283,288 | 0.958915 | 1.282215 |
| 11 | 924 | 12,208 | 0.97333 | 0.658246 |
| 12 | 924 | 12,208 | 0.973336 | 0.658237 |
| 13 | 923 | 21,365 | 0.97333 | 0.658246 |
| 14 | 923 | 21,365 | 0.973336 | 0.658237 |
| 15 | 922 | 20,359 | 0.97333 | 0.658246 |
| 16 | 922 | 20,359 | 0.973336 | 0.658237 |
| 17 | 921 | 19,346 | 0.97333 | 0.658246 |
| 18 | 921 | 19,346 | 0.973336 | 0.658237 |
| 19 | 926 | 12,17,208,292 | 0.88396 | 2.389868 |
| 20 | 925 | 12,208 | 0.88396 | 2.389868 |
+------+------------+---------------+----------+-----------+
20 rows in set (0.00 sec)
Now I want to group data by vacancy_id
mysql> select id,vacancy_id,prof_area_ids,latitude,longitude from jobVacancy group by vacancy_id;
+------+------------+---------------+----------+-----------+
| id | vacancy_id | prof_area_ids | latitude | longitude |
+------+------------+---------------+----------+-----------+
| 1 | 917 | 11,199,202 | 0.973178 | 0.743566 |
| 2 | 916 | 17,283,288 | 0.973178 | 0.743566 |
| 3 | 915 | 17,288 | 0.973178 | 0.743566 |
| 4 | 914 | 30,482 | 0.973178 | 0.743566 |
| 5 | 919 | 15,243 | 0.825153 | 0.692837 |
| 7 | 918 | 8,154 | 0.825153 | 0.692837 |
| 9 | 920 | 17,283,288 | 0.958914 | 1.282161 |
| 11 | 924 | 12,208 | 0.97333 | 0.658246 |
| 13 | 923 | 21,365 | 0.97333 | 0.658246 |
| 15 | 922 | 20,359 | 0.97333 | 0.658246 |
| 17 | 921 | 19,346 | 0.97333 | 0.658246 |
| 19 | 926 | 12,17,208,292 | 0.88396 | 2.389868 |
| 20 | 925 | 12,208 | 0.88396 | 2.389868 |
| 21 | 961 | 4,105 | 0.959217 | 1.280721 |
| 23 | 960 | 8,155 | 0.959217 | 1.280721 |
| 25 | 959 | 12,208 | 0.959217 | 1.280721 |
| 27 | 928 | 1,60 | 0.963734 | 1.070297 |
| 29 | 927 | 32,513 | 0.963734 | 1.070297 |
| 31 | 929 | 6,140 | 0.786553 | 0.678649 |
| 33 | 932 | 1,40,46 | 0.824627 | 0.694182 |
+------+------------+---------------+----------+-----------+
20 rows in set (0.00 sec)
Result is awesome! But problem begins when I want to get all grouped data with faceted
mysql> select id,vacancy_id,prof_area_ids,latitude,longitude from jobVacancy where prof_area_ids=199 group by vacancy_id facet prof_area_ids;
+------+------------+-----------------+----------+-----------+
| id | vacancy_id | prof_area_ids | latitude | longitude |
+------+------------+-----------------+----------+-----------+
| 1 | 917 | 11,199,202 | 0.973178 | 0.743566 |
| 191 | 1004 | 11,196,199 | 0.925335 | 2.768874 |
| 313 | 1072 | 1,11,60,197,199 | 0.963968 | 1.070624 |
| 318 | 1136 | 11,196,199 | 0.96071 | 1.448998 |
| 374 | 1097 | 11,199 | 0.785255 | 0.678504 |
+------+------------+-----------------+----------+-----------+
5 rows in set (0.00 sec)
+---------------+----------+
| prof_area_ids | count(*) |
+---------------+----------+
| 202 | 1 |
| 199 | 12 |
| 11 | 12 |
| 196 | 5 |
| 197 | 3 |
| 60 | 3 |
| 1 | 3 |
+---------------+----------+
7 rows in set (0.02 sec)
Faceted result is incorrect. Because in fact data's count where prof_area_ids=199 must be 5 and not 12. So how I can group field for faceted?
Additionaly
I fount here http://sphinxsearch.com/blog/2013/06/21/faceted-search-with-sphinx/ but just written "If you have a MVA facet, you need to use the GROUPBY() function which returns the actual value on which the grouping was made." and without examle.
mysql> select id,vacancy_id,prof_area_ids,latitude,longitude,GROUPBY() as selected,COUNT(*) from jobVacancy where prof_area_ids=199 group by vacancy_id facet prof_area_ids;
+------+------------+-----------------+----------+-----------+----------+----------+
| id | vacancy_id | prof_area_ids | latitude | longitude | selected | count(*) |
+------+------------+-----------------+----------+-----------+----------+----------+
| 1 | 917 | 11,199,202 | 0.973178 | 0.743566 | 917 | 1 |
| 191 | 1004 | 11,196,199 | 0.925335 | 2.768874 | 1004 | 2 |
| 313 | 1072 | 1,11,60,197,199 | 0.963968 | 1.070624 | 1072 | 3 |
| 318 | 1136 | 11,196,199 | 0.96071 | 1.448998 | 1136 | 3 |
| 374 | 1097 | 11,199 | 0.785255 | 0.678504 | 1097 | 3 |
+------+------------+-----------------+----------+-----------+----------+----------+
5 rows in set (0.00 sec)
+---------------+----------+
| prof_area_ids | count(*) |
+---------------+----------+
| 202 | 1 |
| 199 | 12 |
| 11 | 12 |
| 196 | 5 |
| 197 | 3 |
| 60 | 3 |
| 1 | 3 |
+---------------+----------+
7 rows in set (0.02 sec)
Also faceted result is wrong.
Seems, wanting effectively COUNT(DISTINCT vacancy_id) on the FACET rather than the default COUNT(*), but alas it turns out
... FACET prof_area_ids,COUNT(DISTINCT vacancy_id) AS vacancies BY prof_area_ids
doesnt work. The bit before BY only supports attributes, not custom functions.
... will just have to write it out the long way, with full queries...
select id,vacancy_id,prof_area_ids,latitude,longitude from jobVacancy
where prof_area_ids=199 group by vacancy_id;
SELECT GROUPBY() AS prof_area_id, COUNT(DISTINCT vacancy_id) FROM jobVacancy
WHERE prof_area_ids=199 GROUP BY prof_area_id;
Same results, just slightly more verbose. ie rather than using FACET shorthand, write it
out in full, as multiple seperate queries.
Faceted result is incorrect. Because in fact data's count where prof_area_ids=199 must be 5 and not 12. So how I can group field for faceted?
It looks like you misunderstand how FACET works. It seems to me, that you think it takes as a base the main query's result, but it actually just does another grouping. E.g. here:
mysql> select g, t from idx_mva where t = 11 group by g facet t;
+------+----------+
| g | t |
+------+----------+
| 1 | 11,12 |
| 2 | 11,13,15 |
| 3 | 9,11 |
| 5 | 11,12,15 |
+------+----------+
4 rows in set (0.00 sec)
+------+----------+
| t | count(*) |
+------+----------+
| 12 | 2 |
| 11 | 6 |
| 15 | 4 |
| 13 | 1 |
| 9 | 1 |
| 3 | 1 |
+------+----------+
6 rows in set (0.00 sec)
for t=11 you can see that as in your case it's found 3 times in the 1st query's result, but the count for that is 6 in the FACET's query result. This is because it actually occurs 6 times in the index:
mysql> select * from idx_mva where t = 11;
+------+------+----------+
| id | g | t |
+------+------+----------+
| 2 | 1 | 11,12 |
| 3 | 1 | 11,15 |
| 3 | 2 | 11,13,15 |
| 6 | 3 | 9,11 |
| 8 | 5 | 11,12,15 |
| 11 | 2 | 3,11,15 |
+------+------+----------+
6 rows in set, 1 warning (0.00 sec)
and it happens 3 times in the 1st case only because the t's value is returned only once for each of the groups. You can use group_concat() to see more values from the same group:
mysql> select g, group_concat(to_string(t)) from idx_mva where t = 11 group by g facet t;
+------+----------------------------+
| g | group_concat(to_string(t)) |
+------+----------------------------+
| 1 | 11,12,11,15 |
| 2 | 11,13,15,3,11,15 |
| 3 | 9,11 |
| 5 | 11,12,15 |
+------+----------------------------+
4 rows in set (0.00 sec)
+------+----------+
| t | count(*) |
+------+----------+
| 12 | 2 |
| 11 | 6 |
| 15 | 4 |
| 13 | 1 |
| 9 | 1 |
| 3 | 1 |
+------+----------+
6 rows in set (0.00 sec)
If you want to learn more about faceting here's an interactive course about that - https://play.manticoresearch.com/faceting/
i need help with counting some data
this what i want
| user_id | action_id | count |
-------------------------------------
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 2 | 2 |
| 4 | 3 | 1 |
| 5 | 3 | 2 |
| 6 | 3 | 3 |
| 7 | 4 | 1 |
| 8 | 5 | 1 |
| 9 | 5 | 2 |
| 10 | 6 | 1 |
this is what i have
| user_id | action_id | count |
-------------------------------
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | 1 |
| 6 | 3 | 1 |
| 7 | 4 | 1 |
| 8 | 5 | 1 |
| 9 | 5 | 1 |
| 10 | 6 | 1 |
i really need it for create some research about second action from users
how do i do it?
thank you
Using ROW_NUMBER should work here:
SELECT
user_id,
action_id,
ROW_NUMBER() OVER (PARTITION BY action_id ORDER BY user_id) count
FROM yourTable
ORDER BY
user_id;
Demo
I want to calculate average in Spotfire only when there are minimum 3 values. if there are no values or just 2 values the average should be blank
Raw data:
Product Age Average
1
2
3 10
4 12
5 13 11
6
7 18
8 19
9 20 19
10 21 20
The only way I could really do this is with 3 calculated columns. Insert these calculated columns in this order:
If(Min(If([Age] IS NULL,0,[Age])) over (LastPeriods(3,[Product]))<>0,1) as [BitFlag]
Avg([Age]) over (LastPeriods(3,[Product])) as [TempAvg]
If([BitFlag]=1,[TempAvg]) as [Average]
This will give you the following results. You can ignore / hide the two columns you don't care about.
RESULTS
+---------+-----+---------+------------------+------------------+
| Product | Age | BitFlag | TempAvg | Average |
+---------+-----+---------+------------------+------------------+
| 1 | | | | |
| 2 | | | | |
| 3 | 10 | | 10 | |
| 4 | 12 | | 11 | |
| 5 | 13 | 1 | 11.6666666666667 | 11.6666666666667 |
| 6 | | | 12.5 | |
| 7 | 18 | | 15.5 | |
| 8 | 19 | | 18.5 | |
| 9 | 20 | 1 | 19 | 19 |
| 10 | 21 | 1 | 20 | 20 |
| 11 | | | 20.5 | |
| 12 | 22 | | 21.5 | |
| 13 | 36 | | 29 | |
| 14 | | | 29 | |
| 15 | 11 | | 23.5 | |
| 16 | 23 | | 17 | |
| 17 | 14 | 1 | 16 | 16 |
+---------+-----+---------+------------------+------------------+
Now I need 5 formula for sum of each column, it works fine but I wish it can be simplified to one formula. Is it possible?
|----+----+----+-----+----|
| a | b | c | d | e |
|----+----+----+-----+----|
| 1 | 2 | 3 | 4 | 5 |
| 6 | 7 | 8 | 9 | 10 |
| 11 | 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 | 20 |
|----+----+----+-----+----|
| 34 | 38 | 42 | 160 | 50 |
|----+----+----+-----+----|
#+TBLFM: #>$5=vsum(#2$5..#-1$5)::#>$4=vsum(#2$1..#-1$4)::#>$3=vsum(#2$3..#-1$3)::#>$2=vsum(#2$2..#-1$2)::#>$1=vsum(#2$1..#-1$1)
This should work:
|----+----+----+----+----|
| a | b | c | d | e |
|----+----+----+----+----|
| 1 | 2 | 3 | 4 | 5 |
| 6 | 7 | 8 | 9 | 10 |
| 11 | 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 | 20 |
|----+----+----+----+----|
| 34 | 38 | 42 | 46 | 50 |
|----+----+----+----+----|
#+TBLFM: #>$1..#>$5=vsum(#2$0..#-1$0)
$0 on the RHS is the current column.
i would like to use the current row number of my org table in cell calculations, either in relation to the table as a whole or in relation to an hline.
if i have the following table:
|---+---+---|
| x | y | z |
|---+---+---|
| 2 | 4 | 8 |
| 2 | 4 | 8 |
| 2 | 4 | 8 |
| 2 | 4 | 8 |
| 2 | 4 | 8 |
| 2 | 4 | 8 |
| 2 | 4 | 8 |
| 2 | 4 | 8 |
| 2 | 4 | 8 |
|---+---+---|
#+TBLFM: #II..#III$1=2::$2=4::$3=$1*$2
how do I change it so that the in the y column each cell is equal to its table row number, as shown if you turn on grid mode in org? the resulting table would look like:
|---+----+----|
| x | y | z |
|---+----+----|
| 2 | 2 | 4 |
| 2 | 3 | 6 |
| 2 | 4 | 8 |
| 2 | 5 | 10 |
| 2 | 6 | 12 |
| 2 | 7 | 14 |
| 2 | 8 | 16 |
| 2 | 9 | 18 |
| 2 | 10 | 20 |
|---+----+----|
(defmath passIndex (x)
x
)
Number rows:
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
#+TBLFM: $1=passIndex(##)
Number columns:
| 1 | 2 | 3 | 4 | 5 |
#+TBLFM: #1=passIndex($#)
Number rows with header row:
| header |
|--------|
| 2 |
| 3 |
| 4 |
| 5 |
#+TBLFM: $1=passIndex(##)