idh_hist query is very slow when searching by date - progress-4gl

I am trying to write a query that searches the MFG/PRO invoice table 'idh_hist' for a specific date range. It runs very slowly once I add the date condition, but without it, it is very fast. Can you suggest a way to write a query on idh_hist that runs reasonably fast with these conditions?
Following is my query:
for each idh_hist no-lock
    where idh_domain = "d0002"
      and idh_due_date = TODAY:
    /* display code here... */
end.
Thanks in advance!
Database Index:
Flags Index Name            Cnt  Field Name
----- --------------------- ---- ---------------------
      idh_fsm_type          4    + idh_domain
                                 + idh_fsm_type
                                 + idh_nbr
                                 + idh_line
pu    idh_invln             4    + idh_domain
                                 + idh_inv_nbr
                                 + idh_nbr
                                 + idh_line
      idh_part              4    + idh_domain
                                 + idh_part
                                 + idh_inv_nbr
                                 + idh_line
u     oid_idh_hist          1    + oid_idh_hist

You do not appear to have an index that uses idh_due_date. You will need to add such an index.
The 4gl uses rules to select indexes based on the WHERE clause. The most important rule is that leading components of the index which have equality matches will be used.
The query you have shown has only one such match, on idh_domain, so the tie-breaker rules are applied; these result in the idh_invln index being chosen.
As it is, to satisfy your query every record that matches the idh_domain field has to be searched. (If you only have one domain, that means you are doing a table scan.)
You probably want to add an index on idh_domain and idh_due_date. That would be a perfect match for your query.
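For example, a data definition (.df) sketch for such an index (the index and area names here are illustrative; in practice you would add it through the Data Dictionary):
ADD INDEX "idh_due_date" ON "idh_hist"
  AREA "Index Area"
  INDEX-FIELD "idh_domain" ASCENDING
  INDEX-FIELD "idh_due_date" ASCENDING
With that index in place, your WHERE clause has equality matches on both leading index components, so the 4GL will select it automatically.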

Related

How do I get unique values of one column based on another column using the internal database query in AnyLogic?

How do I get unique values of one column based on another column using the query?
I tried using
(double)selectFrom(tasks).where(tasks.tasks_type.eq()).uniqueResult(tasks.task_cycle_time_hr);
I want to automate this and make sure that all the values of tasks_type are read and a unique value is returned for each of them.
For every value in the column tasks_type, I need a unique value from the column task_cycle_time_hr.
I don't really understand why you're trying to do this in one query.
If you want to get the cycle time (task_cycle_time_hr column) for each task type (tasks_type column), just run a query in a loop for each possible tasks_type value. If you don't know those a priori, loop over the values returned by a query of the task type values, which would look something like
for (String taskType : selectFrom(tasks).list(tasks.tasks_type)) {
    double cycleTime = (double) selectFrom(tasks)
        .where(tasks.tasks_type.eq(taskType))
        .firstResult(tasks.task_cycle_time_hr);
    traceln("Task type " + taskType + ", cycle time " + cycleTime);
}
But this just amounts to querying all rows and reading the task type and cycle time values from each, so you wouldn't normally do it like this: you'd just have a single query looping through all the full rows instead...
List<Tuple> rows = selectFrom(tasks).list();
for (Tuple row : rows) {
    traceln("Task type " +
            row.get(tasks.tasks_type) + ", cycle time " +
            row.get(tasks.task_cycle_time_hr));
}
NB: I assume you don't have any rows with duplicate task types, because then the whole exercise doesn't make sense, unless you want only the first row for each task type value, or some kind of aggregate (e.g., sum) of the cycle time values for each given task type. You were trying to use uniqueResult, which may mean you want a value when there is exactly one row (for a given task type) and no result otherwise. But uniqueResult throws an exception if there isn't exactly one row, so you can't use it directly like that. In that case one way (there are others, some probably slightly better) would be to do a count first to check; e.g., something like
for (String taskType : selectFrom(tasks).list(tasks.tasks_type)) {
    int rowCount = (int) selectFrom(tasks)
        .where(tasks.tasks_type.eq(taskType))
        .count();
    if (rowCount == 1) {
        double cycleTime = (double) selectFrom(tasks)
            .where(tasks.tasks_type.eq(taskType))
            .firstResult(tasks.task_cycle_time_hr);
        traceln("Task type " + taskType + ", unique cycle time " + cycleTime);
    }
}
Import your Excel sheet into the AnyLogic internal DB, then use the DB wizard, which takes you step by step through writing the code to retrieve the data you want:
(double) selectFrom(data)
    .where(data.tasks.eq("T1"))
    .firstResult(data.task_cycle_time_hr)

How to increment value in counter table

In my table I have the following scheme:
id - integer | date - text | name - text | count - integer
I want just to count some actions.
I want to insert a row with count = 1 when a row for date = '30-04-2019' does not exist yet.
I want to add +1 to count when the row already exists.
My idea is:
UPDATE "call"
SET count = (1 + (SELECT count
                  FROM "call"
                  WHERE date = '30-04-2019'))
WHERE date = '30-04-2019'
But it does not work when the row doesn't exist.
Is this possible without extra triggers, etc.?
You can use a writeable CTE to achieve this. Additionally, the UPDATE statement can be simplified to a simple set count = count + 1; there is no need for a sub-select.
with updated as (
  update "call"
    set count = count + 1
  where date = '30-04-2019'
  returning id
)
insert into "call" (date, count)
select '30-04-2019', 1
where not exists (select *
                  from updated);
If the update did not find a row, the where not exists condition will be true and the insert will be executed.
Note that the above is not safe for concurrent execution from multiple transactions. If you want to make this safe, create a unique index on the date column. Then use an INSERT ... ON CONFLICT instead:
insert into "call" (date, count)
values ('30-04-2019', 1)
on conflict (date)
do update
set count = "call".count + 1;
Again: the above requires a unique index (or constraint) on the date column.
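For example (a sketch; the index name is illustrative):
create unique index call_date_uniq on "call" (date);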
Unrelated to the immediate problem, but: storing dates in a text column is a really, really bad idea. You should change your table definition and change the data type for the "date" column to date.
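For example, in PostgreSQL the column can be converted in place (a sketch, assuming all existing values use the 'DD-MM-YYYY' format shown above):
alter table "call"
  alter column date type date
  using to_date(date, 'DD-MM-YYYY');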

How to write a date range condition

How would I write the date range condition correctly for the following query: list all instruments from table "asset" where "maturity_dt" is more than 1 year away?
In English it sounds like: AND asset.maturity_dt >= Today + 365.
If you are using MySQL, you will need a query like:
Select * from asset where DATEDIFF(maturity_dt, now()) > 365;
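Note that wrapping the column in DATEDIFF() prevents an index on maturity_dt from being used, because the function must be evaluated for every row. An equivalent, index-friendly form keeps the column bare (a sketch, still assuming MySQL):
select * from asset
where maturity_dt > curdate() + interval 365 day;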

sphinx omit (discard) search fields

Suppose we have some search params like (author, genre, cost) and we have to get N=15 rows.
Query: select ... where author=a and genre=b and cost=c LIMIT N
We have to get N rows, but we found only 2. So we have to drop the cost param.
Query: select ... where author=a and genre=b LIMIT N
Now we have 10 < N rows, so we also have to drop genre.
Query: select ... where author=a LIMIT N, and so on...
How do I do this the right way? I think making multiple queries is expensive, and making a query like
select if(author=a and genre=b and cost=c, 1, 0) as f, if(author=a and genre=b, 1, 0) as s, ... order by f desc, s desc, ...
is also expensive, because the table has more than 500,000 rows.
You can probably make it a bit more efficient with
select ..., (author=a) + (genre=b) + (cost=c) as f from ... order by f desc
(if you want to maintain priority, you can do (author=a) * 4 + ... etc.)
But in general, you have no MATCH() so the query will ALWAYS be a full-table scan. It will have to inspect and potentially sort EVERY row in the table.
There is no way to make it truly efficient, other than pre-computing values and storing them in the index (you could even precompute values in fields to take advantage of the full-text index).
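A sketch of that single-query scoring approach in SphinxQL, using IF() so each match contributes an explicit weight (myindex is a placeholder, and a, b, c stand for the searched values):
SELECT *,
       IF(author = a, 4, 0) + IF(genre = b, 2, 0) + IF(cost = c, 1, 0) AS f
FROM myindex
ORDER BY f DESC
LIMIT 15;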

filtering on a range of values in a db column with sqlalchemy orm

I have a PostgreSQL database with one particular table that has many rows. One column in this table, called data, is a float array (REAL[]) and gets filled with an array of ~4500 elements. I want to access this table through some query via SQLAlchemy and the ORM.
How do I select all rows in the table where a subset of this column satisfies some condition, e.g. contains a range of values? For instance, I want to select all rows where the data contains values >= 10, or values >= 10 and <= 20.
Can I do this with a straight session query like
rows = session.query(Table).filter(Table.data.(some conditional)).all()
where my conditional is something like "VALUES >= 10 and VALUES <= 20"?
Or do I need to define some special methods, or setup, when I'm defining my SQLAlchemy table class? For example, I have my table set up as
class Table(Base):
    __tablename__ = 'table'
    __table_args__ = {'autoload': True, 'schema': 'testdb', 'extend_existing': True}
    data = deferred(Column(ARRAY(Float)))

    def __repr__(self):
        return '<Table (pk={0})>'.format(self.pk)
Ideally I'd like to set it up so I can just do simple filtering in my session.query calls. Is this possible? I'm not super familiar with the ORM, so maybe it is?
I've had a look at the ARRAY Comparator SQLAlchemy docs, but those only seem to work on exact values. My data is precise to 6 sigfigs, and I don't know the exact values ahead of time.
What's the best way to do this? Thanks.
EDIT:
Based on the comment below, here is the code I'm using to attempt to select all rows (out of 1000) whose data column contains values >= 1.0. There should be 537 rows.
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).all()
This gives the correct subset number: len(rows) = 537. However, I don't understand the logic of this operator: to select data >= 1.0, I use the le operator? Also, along the same lines, there should be 234 rows that have data between the values >= 1.0 and <= 1.2, but this statement fails to give the correct subset:
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).filter(datadb.Table.data.any(1.2,operator=operators.ge)).all()
EDIT 2:
Here's an example of my database table with a few rows (database datadb, table Table). pk is an integer, and data is a real[].
pk   data
0    [0.0,0.0,0.5,0.3,1.3,1.9,0.3,0.0,0.0]
1    [0.1,0.0,1.0,0.7,1.1,1.5,1.2,0.3,1.4]
2    [0.0,0.6,0.4,0.3,1.6,1.7,0.4,1.3,0.0]
3    [0.0,0.1,0.2,0.4,1.0,1.1,1.2,0.9,0.0]
4    [0.0,0.0,0.5,0.3,0.2,0.1,0.7,0.3,0.1]
I have 5 rows; 4 of them have data with values >= 1.0, while just 2 have values in the range >= 1.0 and <= 1.2. The query I would use to grab the rows in the first case is
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).all()
This should return the 4 rows, at pk=0,1,2,3. This query does what I expect. The second case
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le)).filter(datadb.Table.data.any(1.2,operator=operators.ge)).all()
and should return the 2 rows at pk=1,3. However, this query just returns the same 4 rows as the first query. For the second query, I also tried
rows = session.query(datadb.Table).filter(datadb.Table.data.any(1.0,operator=operators.le),datadb.Table.data.any(1.2,operator=operators.ge)).all()
which also didn't work.
Please read the documentation on ARRAY.Comparator, according to which you should be able to do the following:
rows = (session.query(Table)
        .filter(Table.data.any(10, operator=operators.le))
        .filter(Table.data.any(20, operator=operators.ge))
        ).all()
EDIT:
# combined filter does not work,
# but applying one or the other is still useful as it reduces the result set
q = (session.query(MyTable)
     .filter(MyTable.data.any(1.0, operator=operators.le))
     # .filter(MyTable.data.any(1.2, operator=operators.ge))
     )

# filter in memory
items = [_row for _row in q.all()
         if any(1.0 <= item <= 1.2 for item in _row.data)]

for item in items:
    print(item)
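For context on why the operators look reversed and why the combined filter fails: data.any(x, operator=op) renders as x op ANY (data), with the literal on the left, so le selects rows where some element is >= x. Each .filter() can then be satisfied by a different array element, which is why ANDing the two bounds still returns every row containing any value >= 1.0 and any value <= 1.2. To force both bounds onto the same element, the test has to run per element on the server, e.g. with an EXISTS over unnest(). A minimal sketch, reusing MyTable and session from the snippet above and falling back to a textual predicate:
from sqlalchemy import text

# one and the same unnested element must satisfy both bounds
q = session.query(MyTable).filter(
    text("EXISTS (SELECT 1 FROM unnest(data) AS elem "
         "WHERE elem BETWEEN 1.0 AND 1.2)")
)
for row in q.all():
    print(row)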