How to fix the GROUP BY clause error from sql_range_query - postgresql

I am receiving an error while building the index. I recently switched to PostgreSQL and I am using Navicat. The error comes from the Locations table, which has the zip code, city, state, latitude, and longitude.
indexing index 'location_core'...
collected 84068 docs, 0.7 MB
sorted 0.1 Mhits, 100.0% done
total 84068 docs, 716268 bytes
total 0.686 sec, 1043676 bytes/sec, 122495.78 docs/sec
indexing index 'user_core'...
ERROR: index 'user_core': sql_range_query: ERROR: column "locations.latitude" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...sers"."updated_at")::int AS "updated_at", RADIANS(locations....
^
(DSN=pgsql://lexi87:***#localhost:5432/dating_development).
total 0 docs, 0 bytes
total 0.007 sec, 0 bytes/sec, 0.00 docs/sec
indexing index 'user_delta'...
ERROR: index 'user_delta': sql_range_query: ERROR: column "locations.latitude" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...sers"."updated_at")::int AS "updated_at", RADIANS(locations....
How exactly can I put these columns in a GROUP BY clause? Any help would be appreciated!
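For reference, PostgreSQL (unlike MySQL) requires every selected column that is not wrapped in an aggregate function to be listed in the GROUP BY clause. A minimal sketch of the fix in plain SQL, assuming a hypothetical join from users to locations on a user_id column (the real generated query will differ):

SELECT "users"."id",
       RADIANS(locations.latitude)  AS latitude,
       RADIANS(locations.longitude) AS longitude
FROM "users"
JOIN locations ON locations.user_id = "users"."id"
GROUP BY "users"."id", locations.latitude, locations.longitude

Since this query is generated by the Sphinx indexer rather than written by hand, the equivalent fix is usually to add locations.latitude and locations.longitude to the index definition's group_by setting instead of editing the SQL directly.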

Related

Tableau count number of Records that have the Max value for that field

I have a field where I'd like to count the number of times the field contains the max value for that column. For example, if the max value for a given column is 20, I want to know how many 20's are in that column. I've tried the following formula, but I receive a "Cannot mix aggregate and non-aggregate arguments with this function" error.
IF [Field1] = MAX([Field1])
THEN 1
ELSE 0
END
Try
IF ATTR([Field1]) = MAX([Field1])
THEN 1
ELSE 0
END
ATTR() is an aggregation which will allow you to compare aggregate and non-aggregate values. As long as the field you are aggregating with ATTR() contains unique values, this won't have an impact on your data.
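If your Tableau version supports level-of-detail expressions, another option is a table-scoped LOD, which computes the max once over the whole table and compares it at row level, avoiding the aggregate/non-aggregate mix entirely (a sketch, not tested against your workbook):

IF [Field1] = {MAX([Field1])}
THEN 1
ELSE 0
END

Summing this calculated field then gives the number of rows that hold the maximum.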

How to optimize the query (which takes too long)

I have a query that computes, for each person, what fraction of the locations are up to 100 meters away (relative to all distances):
select person_tbl.tdm, sum((st_distance (person_tbl.geo, location_tbl.geo) < 100)::INT)::FLOAT / count(*)
from persons as person_tbl, locations as location_tbl
where person_tbl.geo is not null
group by person_tbl.tdm
The two tables have geometry indexes:
create index persons_geo_idx on persons using gist(geo)
create index locations_geo_idx on locations using gist(geo)
In the first table (persons) the geo values are of type POLYGON.
In the second table (locations) the geo values are POINT Z, POLYGON Z, or MULTIPOLYGON Z.
The first table, persons, contains ~2M rows and the second table, locations, contains ~500 rows.
The query takes too long (~2 hours).
The values of max_parallel_processes and max_parallel_workers are both 8.
Is there something I can do to optimize the query's calculation time (2 hours seems too long)?
Is there a better way to write the query, or do I need to define the indexes in another way?
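One change worth trying is to replace the st_distance comparison with st_dwithin, which runs a cheap bounding-box test first and only computes the exact distance for candidates that pass it, combined with a FILTER clause to keep the same result shape. A sketch, assuming geo is stored in a projection whose unit is meters:

select person_tbl.tdm,
       count(*) filter (where st_dwithin(person_tbl.geo, location_tbl.geo, 100))::float / count(*)
from persons as person_tbl
cross join locations as location_tbl
where person_tbl.geo is not null
group by person_tbl.tdm

Note that because the denominator counts every (person, location) pair, the cross join itself cannot be avoided: with ~2M x ~500 rows you are still evaluating roughly a billion pairs, so st_dwithin only makes each evaluation cheaper rather than reducing the pair count.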

How do I limit the number of rows updated in Postgres

I am trying to update records in the customer table, limiting the update to n records, but I get an error when I use the offset and limit keywords.
Where do I put
offset 0 limit 1
in the update statement's subquery? The subquery looks like:
update customer set name = 'sample name' where customer_id in (142, 143, 144, 145 offset 0 limit 1);
When I try executing the update statement above, I get an error:
ERROR: syntax error at or near "offset"
Note: the limit does not have to be 1; it can be any number, and the same is true for the offset.
offset and limit work on rows, not on a list.
You can transform the in() clause into a subquery that returns a row for each of your input values:
update customer
set name = 'sample name'
where customer_id in (select unnest(array[142, 143, 144, 145]) offset 0 limit 1);
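If you need the limited slice to be deterministic, or the ids come from a table rather than a literal list, the same idea is often written as an ordered subquery over the table itself (a sketch against the same customer table):

update customer
set name = 'sample name'
where customer_id in (
    select customer_id
    from customer
    where customer_id in (142, 143, 144, 145)
    order by customer_id
    offset 0 limit 2
);

Without an order by, offset and limit pick an arbitrary subset of the matching rows.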

Iterate over current row values in kdb query

Consider the table:
q)trade
stock price amt time
-----------------------------
ibm 121.3 1000 09:03:06.000
bac 5.76 500 09:03:23.000
usb 8.19 800 09:04:01.000
and the list:
q)x: 10000 20000
The following query:
q)select from trade where price < x[first where (x - price) > 100f]
'length
fails as above. How can I pass the current row's value of price into each iteration of the search query?
While price[0] in the square brackets above works, that's obviously not what I want. I even tried price[i], but that gives the same error.
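One way to evaluate the condition row by row is to wrap it in a lambda and apply it to the price column with each; a sketch, untested against your data:

q)select from trade where {[p] p < x first where (x - p) > 100f} each price

Inside the lambda, p is the scalar price of the current row, while x still resolves to the global list, so the subtraction no longer mixes vectors of mismatched lengths (the cause of the 'length error).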

DB2 cardinality estimation for BETWEEN predicate

This is more of an academic question, since I'm interested in the details of the DB2 query optimizer.
I have a table with 10000 records and no indexes. CARD in SYSCAT.TABLES shows 10000, and a column named MesFabricacao has the following statistics in SYSCAT.COLUMNS:
COLCARD  HIGH2KEY  LOW2KEY  TABNAME  COLUMN
198      198       2        CARINFO  MESFABRICACAO
According to the core manual "DB2PerfTuneTroubleshoot-db2d3e1011.pdf", page 451, the cardinality formula for a BETWEEN predicate with no histogram is: ((KEY2 - KEY1) / (HIGH2KEY - LOW2KEY)) * CARD
For the query SELECT COUNT(*) FROM STATS.CARINFO WHERE MesFabricacao BETWEEN 1 AND 3, db2exfmt shows a filter factor of 0.0151258, a value I can't explain the optimizer using for its estimate.
Does anyone have an explanation of why DB2 is applying this filter factor? I'm using DB2 10.1.0.0.
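For reference, plugging the statistics above into the manual's formula gives a noticeably different number:

FF = (KEY2 - KEY1) / (HIGH2KEY - LOW2KEY)
   = (3 - 1) / (198 - 2)
   = 2 / 196
   ≈ 0.0102

i.e. an estimate of roughly 102 rows, not the 151.258 rows (filter factor 0.0151258) that db2exfmt reports below.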
(Output from db2exfmt)
Predicates:
2) Sargable Predicate,
Comparison Operator: Less Than or Equal (<=)
Subquery Input Required: No
Filter Factor: 0.0151258
Predicate Text:
--------------
(Q1.MESFABRICACAO <= 3)
3) Sargable Predicate,
Comparison Operator: Less Than or Equal (<=)
Subquery Input Required: No
Filter Factor: 1
Predicate Text:
--------------
(1 <= Q1.MESFABRICACAO)
Input Streams:
-------------
1) From Object STATS.CARINFO
Estimated number of rows: 10000
Number of columns: 2
Subquery predicate ID: Not Applicable
Column Names:
------------
+Q1.$RID$+Q1.MESFABRICACAO
Output Streams:
--------------
2) To Operator #2
Estimated number of rows: 151.258
Number of columns: 0