PostgreSQL - My query is too slow. What's the problem? (Takes more than 1 minute)

Below is the query that I use.
I want to see the results within 2 or 3 seconds, but it takes more than a minute.
with meta as (
    select
        item_name
        , item_sp
        , grade
    from
        meta_info
    where
        item_sp = 'Bc_1'
        and grade_no = (
            select
                max(grade_no)
            from
                meta_info
            where
                item_sp = 'Bc_1'
        )
)
select
    *
from (
    select
        m.grade
        , i.item_sp
        , i.regist_date
        , i.serial_key
        , row_number() over (partition by i.serial_key order by m.grade) as serial_key_number
    from
        item_info i, meta m
    where
        i.item_sp = 'Bc_1'
        and i.regist_date = '20210314'
        and i.regist = true
        and i.item_name = m.item_name
        and i.item_sp = m.item_sp
) i
where
    not exists (
        select
            serial_key
        from
            item_info ii
        where
            ii.item_sp = 'Bc_1'
            and ii.regist_date < '20210314'
            and i.serial_key = ii.serial_key
    )
    and i.serial_key_number = 1;
The query uses two tables: meta_info and item_info.
meta_info contains the basic information of each product, and item_info stores the grade and serial key of each product by date.
In the item_info table, serial_key is not a key column, so the same serial key can appear in multiple rows.
Here's the problem.
The query compares against all serial keys registered before a particular date to find serial keys that are being registered for the first time, and, because the same serial key can appear once per grade, it keeps only the highest-grade row for each serial key.
However, item_info contains more than 10 million rows.
Below is the table structure.
1. meta_info

item_sp   grade   item_name   grade_no
ac_1      A       BOOK        2
ac_1      B       FOOD        2
bc_1      A       WATER       2
cc_1      C       MOUSE       2
...
2. item_info

item_no(key)   item_sp   item_name   serial_key     regist_date   regist
1              ac_1      BOOK        fgd5756ffdsf   20210314      true
2              ac_1      FOOD        bnffdhtj       20210314      true
3              bc_1      WATER       fdfh4fsdfsf    20210314      true
4              cc_1      MOUSE       htt55434       20210314      true
...

Almost all of the time is going to the last index scan in your plan. You should be able to improve it greatly by adding an index on item_info (serial_key, item_sp, regist_date).
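For example (the index name is illustrative), the suggested index would be:

CREATE INDEX idx_item_info_serial_sp_date
    ON item_info (serial_key, item_sp, regist_date);

With serial_key as the leading column, the NOT EXISTS subquery can probe the index for each candidate serial key instead of rescanning item_info.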

Related

Slow running query comparing dates

Could someone please offer some advice? I have the following query that runs against roughly 200,000 records. I need to evaluate a DATETIME field to determine whether the revenue occurs during the correct time slot. I am currently using CASE statements to evaluate the DATETIME field, and it is an absolute pig: it runs for over 5 minutes. Is there a faster, more efficient way to do this? Note that the variables @cur_date, @end_date, @prev_yr_qtr_start, @cur_date_yr_prev, etc. are all strings, and r.pw_ship_date is of type DATETIME. So in essence I'm comparing r.pw_ship_date to strings, i.e. '2017-01-01 00:00'.
Note: it took 4 minutes to run this query when I added SELECT TOP(500); for the full 200,000 records it would take forever.
Thanks in advance
DECLARE @total TABLE
(
acct_number VARCHAR(50),
pro_nbr VARCHAR(50),
sales_rep VARCHAR(50),
bill_to_name VARCHAR(50),
billing_addr1 VARCHAR(50),
billing_addr2 VARCHAR(50),
billing_city CHAR(50),
billing_state CHAR(2),
billing_zip CHAR(10),
cur_month_bills INT,
cur_month_rev DECIMAL(30, 6),
cur_qtr_bills INT,
cur_qtr_rev DECIMAL(30, 6),
prev_yr_qtr_bills INT,
prev_yr_qtr_rev DECIMAL(30, 6),
cur_ytd_bills INT,
cur_ytd_rev DECIMAL(30, 6),
prev_ytd_bills INT
)
INSERT INTO @total
SELECT TOP(50000) f.acct_number ,
r.pro_nbr ,
r.sales_rep ,
r.bill_to_name ,
r.billing_addr1 ,
r.billing_addr2 ,
r.billing_city ,
r.billing_state ,
r.billing_zip ,
'cur_month_bills' = MAX(( CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN 1 ELSE 0 END )) ,
'cur_month_rev' = MAX(ROUND(( CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN f.tot_revenue ELSE 0 END ), 2)) ,
'cur_qtr_bills' = MAX((CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN 1 ELSE 0 END )) ,
'cur_qtr_rev' = MAX(ROUND(CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN f.tot_revenue ELSE 0 END, 2)) ,
'prev_yr_qtr_bills' = MAX(CASE WHEN r.pw_ship_date BETWEEN @prev_yr_qtr_start AND @cur_date_yr_prev THEN 1 ELSE 0 END ) ,
'prev_yr_qtr_rev' = MAX(ROUND(CASE WHEN r.pw_ship_date BETWEEN @prev_yr_qtr_start AND @cur_date_yr_prev THEN f.tot_revenue ELSE 0 END , 2)) ,
'cur_ytd_bills' = MAX(CASE WHEN r.pw_ship_date BETWEEN @first_day_cur_yr AND @end_date THEN 1 ELSE 0 END ),
'cur_ytd_rev' = MAX(ROUND(CASE WHEN r.pw_ship_date BETWEEN @first_day_cur_yr AND @end_date THEN f.tot_revenue ELSE 0 END , 2)) ,
'prev_ytd_bills' = MAX(CASE WHEN r.pw_ship_date BETWEEN @first_day_prev_yr AND @end_date THEN 1 ELSE 0 END )
FROM @summed f
INNER JOIN @raw r ON f.acct_number = r.acct_number AND f.pro_nbr = r.pro_nbr
GROUP BY f.acct_number ,
r.pro_nbr ,
r.sales_rep ,
r.bill_to_name ,
r.billing_addr1 ,
r.billing_addr2 ,
r.billing_city ,
r.billing_state ,
r.billing_zip;
Change your table variables @raw and @summed to temporary tables. Table variables have no statistics and are extremely limited with regard to indexing (you can only have one). Because of this, SQL Server assumes that your table variables contain only one row (2012 and older) or 100 rows (2014+). This means you are almost certainly getting a bad execution plan for your query, and that's going to ruin you.
Once you've changed @raw and @summed into #raw and #summed, put an index on them - at a minimum, index the fields you're joining on, acct_number and pro_nbr. It may be worth creating a clustered index and/or a primary key as well, but that's something you'll need to experiment with to find the performance you require.
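A minimal sketch of the conversion, covering only the columns visible in the query (the column types are guesses based on the @total declaration; add the remaining columns your real tables have):

CREATE TABLE #raw
(
    acct_number VARCHAR(50),
    pro_nbr VARCHAR(50),
    pw_ship_date DATETIME
    -- plus the other columns selected from @raw above
);
CREATE INDEX ix_raw_acct_pro ON #raw (acct_number, pro_nbr);

CREATE TABLE #summed
(
    acct_number VARCHAR(50),
    pro_nbr VARCHAR(50),
    tot_revenue DECIMAL(30, 6)
);
CREATE INDEX ix_summed_acct_pro ON #summed (acct_number, pro_nbr);

Unlike table variables, these temp tables get real statistics, so the optimizer can estimate row counts and pick a sensible join strategy.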
The other thing that is killing your performance is comparing DATETIME values to strings. This forces a type conversion, and that can drag you down significantly. If you're working with a date/time, use the appropriate data type - not a string that looks like a date.
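For example, declaring the boundary variables with a proper date/time type (the values here are placeholders):

DECLARE @cur_date DATETIME = '2017-01-01 00:00';
DECLARE @end_date DATETIME = '2017-01-31 23:59';
-- declare @prev_yr_qtr_start, @cur_date_yr_prev, etc. the same way

The string literal is converted once, at assignment, and r.pw_ship_date BETWEEN @cur_date AND @end_date then compares DATETIME to DATETIME with no per-row conversion.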
If this is still not running quickly enough, move your CASE statements out of your aggregate functions.
MAX(( CASE WHEN r.pw_ship_date BETWEEN @cur_date AND @end_date THEN 1 ELSE 0 END ))
Move the CASE statement into the query that populates #raw, so that when you're performing the aggregate you're just looking at integers all the way down.
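A sketch of that idea (the flag column name is illustrative; you would add one per date bucket):

-- compute each bucket flag once, while building #raw
ALTER TABLE #raw ADD cur_month_flag INT;

UPDATE #raw
SET cur_month_flag = CASE WHEN pw_ship_date BETWEEN @cur_date AND @end_date
                          THEN 1 ELSE 0 END;

-- the aggregate then collapses to a plain MAX over an integer column:
-- 'cur_month_bills' = MAX(r.cur_month_flag)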

How to extract number from a string

I have a string like 'intercompany creditors {DEMO[[1]]}'. I want to extract only the number from the string, in this example just '1'.
How to do this in Invantive SQL?
You should be able to do so with substr (get a piece of text from specific positions in the text) and instr (get the position of a specific piece of text inside some other text):
select substr
( d
, instr(d, '[[') + 2
, instr(d, ']]') - instr(d, '[[') - 2
)
from ( select 'intercompany creditors {DEMO[[1]]}' d
from dual@DataDictionary
) x

FMP 14 - Auto Populate a Field based on a calculation

I am using FMP 14 and would like to auto-populate field A based on the following calculation:
If ( Get ( ActiveLayoutObjectName ) = "tab_Visits_v1" ; "1st" ) or
If ( Get ( ActiveLayoutObjectName ) = "tab_Visits_v2" ; "2nd" ) or
If ( Get ( ActiveLayoutObjectName ) = "tab_Visits_v3" ; "3rd" ) or
If ( Get ( ActiveLayoutObjectName ) = "tab_Visits_v4" ; "4th" ) or
If ( Get ( ActiveLayoutObjectName ) = "tab_Visits_v5" ; "5th" ) or
If ( Get ( ActiveLayoutObjectName ) = "tab_Visits_v6" ; "6th" )
The above code is supposed to auto-populate the value 1st, 2nd, 3rd, ... in field A, depending on the object name that Get ( ActiveLayoutObjectName ) returns. I have named each of my tabs, but the calculation only returns 0.
Any help would be appreciated.
thanks.
The way your calculation is written makes very little sense. Each one of the If() statements returns either "1st", "2nd", etc. or nothing (an empty string). You are then applying or to all these results. Since only one of them can be true, your calculation is essentially doing something like:
"" or "2nd" or "" or "" or "" or ""
which happens to return 1 (true), but has no useful meaning.
You should be using the Case() function here:
Case (
Get ( ActiveLayoutObjectName ) = "tab_Visits_v1" ; "1st" ;
Get ( ActiveLayoutObjectName ) = "tab_Visits_v2" ; "2nd" ;
Get ( ActiveLayoutObjectName ) = "tab_Visits_v3" ; "3rd" ;
Get ( ActiveLayoutObjectName ) = "tab_Visits_v4" ; "4th" ;
Get ( ActiveLayoutObjectName ) = "tab_Visits_v5" ; "5th" ;
Get ( ActiveLayoutObjectName ) = "tab_Visits_v6" ; "6th"
)
Note also that a calculation field may not always refresh itself as a result of the user switching a tab. This refers to an unstored calculation field; if you are trying to use this as the formula to be auto-entered into a "regular" (e.g. Text) field, it will never update.
Added:
Here is our situation. We see a patient a maximum of 6 times. We have a tab for each one of those 6 visits.
I would suggest you use a portal to a related table of Visits instead of a tab control. A tab control is designed to display fixed components of user interface - not dynamic data. And certainly not data split into separate records. You should have only one unique record for each patient - and as many records for each patient's visits as may be necessary (potentially unlimited).
If you like, you can use the portal rows as buttons to select a specific visit to view in more detail (similar to a tab control, except that the portal shows the "tabs" as vertical rows). A one-row portal to the same Visits table, filtered by the user selection, would work very well for this purpose, I believe.
If values like 1., 2., ... were acceptable instead of 1st, 2nd, ..., it would be easy:
Right (Get ( ActiveLayoutObjectName ) ; 1) & "."
Thanks for pointing out that my first version does not work.

FileMaker Pro 12 - Using Commas As Separators For Address Export

FMP12
I need to use commas to separate data from customer addresses. The address is imported into a notes field like this: "1234 East Palm Ave, Berkeley, Ca 94150".
My script needs to recognize whether the street name is 2 or 3 words, and whether the city name is 1 or 2 words. The info gets broken up into a few Set Variable steps ($streetName, $cityName, $zipCode) and exported to create a new customer profile.
I was trying to do that with a string search function that finds the position of the first and second instance of each ",", but my function isn't working. Any hints about which function to use would be greatly appreciated.
Assuming the formatting is consistent, try:
$streetName =
Let (
tokens = Substitute ( Address ; ", " ; ¶ )
;
GetValue ( tokens ; 1 )
)
$cityName =
Let (
tokens = Substitute ( Address ; ", " ; ¶ )
;
GetValue ( tokens ; 2 )
)
$zipCode =
RightWords ( Address ; 1 )
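With the sample address above, Substitute ( Address ; ", " ; ¶ ) turns the text into three lines ("1234 East Palm Ave", "Berkeley", "Ca 94150"), so GetValue ( tokens ; 1 ) returns "1234 East Palm Ave" and GetValue ( tokens ; 2 ) returns "Berkeley" no matter how many words the street or city name contains, while RightWords ( Address ; 1 ) returns "94150".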

DB2 Case Sensitivity

I'm having great difficulty making my DB2 (AS/400) queries case-insensitive.
For example:
SELECT *
FROM NameTable
WHERE LastName = 'smith'
Will return no results, but the following returns 1000's of results:
SELECT *
FROM NameTable
WHERE LastName = 'Smith'
I've read of putting SortSequence/SortType into your connection string but have had no luck... does anyone have experience with this?
Edit:
Here's the stored procedure:
BEGIN
DECLARE CR CURSOR FOR
SELECT T.ID,
       T.LASTNAME,
       T.FIRSTNAME,
       T.MIDDLENAME,
       T.STREETNAME || ' ' || T.ADDRESS2 || ' ' || T.CITY || ' ' || T.STATE || ' ' || T.ZIPCODE AS ADDRESS,
       T.GENDER,
       T.DOB,
       T.SSN,
       T.OTHERINFO,
       T.APPLICATION
FROM ( SELECT R.*, ROW_NUMBER() OVER () AS ROW_NUM
       FROM CPSAB32.VW_MYVIEW R
       WHERE R.LASTNAME = IFNULL(@LASTNAME, LASTNAME)
         AND R.FIRSTNAME = IFNULL(@FIRSTNAME, FIRSTNAME)
         AND R.MIDDLENAME = IFNULL(@MIDDLENAME, MIDDLENAME)
         AND R.DOB = IFNULL(@DOB, DOB)
         AND R.STREETNAME = IFNULL(@STREETNAME, STREETNAME)
         AND R.CITY = IFNULL(@CITY, CITY)
         AND R.STATE = IFNULL(@STATE, STATE)
         AND R.ZIPCODE = IFNULL(@ZIPCODE, ZIPCODE)
         AND R.SSN = IFNULL(@SSN, SSN)
       FETCH FIRST 500 ROWS ONLY ) AS T
WHERE ROW_NUM <= @MAXRECORDS
OPTIMIZE FOR 500 ROWS;
OPEN CR;
RETURN;
END
Why not do this:
WHERE lower(LastName) = 'smith'
If you're worried about performance (i.e. the query not using an index), keep in mind that DB2 supports function indexes, i.e. indexes on expressions. So essentially, you can create an index on upper(LastName).
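A sketch of such an index, matching the lower() form used above (the index name is illustrative, and this assumes a DB2 release that supports indexes on expressions):

CREATE INDEX IX_NAMETABLE_LASTNAME_LOWER
    ON NameTable ( LOWER(LastName) );

With that in place, WHERE lower(LastName) = 'smith' can be answered with an index probe instead of a full scan.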
EDIT
To do the debugging technique I discussed in the comments, you could do something like this:
create table log (msg varchar(100), dt date);
Then in your SP, you can insert messages to this table for debugging purposes:
insert into log (msg, dt) select 'inside the SP', current_date from sysibm.sysdummy1;
Then after the SP runs, you can select from this log table to see what happened.
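For example, after the procedure has run:

select msg, dt from log order by dt;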
If you want case-insensitive in your procedure, try using this option in it:
SET OPTION SRTSEQ = *LANGIDSHR ;
You should also create an index to support it for performance. Create the index when you have *LANGIDSHR as a connection attribute, and the shared-weight index should then be available to later jobs. (There are various ways to get the appropriate setting into effect.)
*LANGIDSHR relates to the language ID of your jobs. Characters that might be considered "equal", such as 'A' and 'a' or 'ü' and 'u', are given equal (shared) weights and so select together.
I did something similar when I wanted a case-insensitive search. I used UPPER(myfield) = 'SEARCHSTRING'. I know this works.
See: https://stackoverflow.com/a/47181640/5507619
Database setting
There is a database config setting you can set at database creation. It's based on Unicode, though.
CREATE DATABASE yourDB USING COLLATE UCA500R1_S1
The default Unicode Collation Algorithm is implemented by the UCA500R1 keyword without any attributes. Since the default UCA cannot simultaneously encompass the collating sequence of every language supported by Unicode, optional attributes can be specified to customize the UCA ordering. The attributes are separated by the underscore (_) character. The UCA500R1 keyword and any attributes form a UCA collation name.
The Strength attribute determines whether accent or case is taken into account when collating or comparing text strings. In writing systems without case or accent, the Strength attribute controls similarly important features.
The possible values are: primary (1), secondary (2), tertiary (3), quaternary (4), and identity (I). To ignore:
accent and case, use the primary strength level
case only, use the secondary strength level
neither accent nor case, use the tertiary strength level
Almost all characters can be distinguished by the first three strength levels, therefore in most locales the default Strength attribute is set at the tertiary level. However if the Alternate attribute (described below) is set to shifted, then the quaternary strength level can be used to break ties among white space characters, punctuation marks, and symbols that would otherwise be ignored. The identity strength level is used to distinguish among similar characters, such as the MATHEMATICAL BOLD SMALL A character (U+1D41A) and the MATHEMATICAL ITALIC SMALL A character (U+1D44E).
Setting the Strength attribute to a higher level will slow down text string comparisons and increase the length of the sort keys.
Examples:
UCA500R1_S1 will collate "role" = "Role" = "rôle"
UCA500R1_S2 will collate "role" = "Role" < "rôle"
UCA500R1_S3 will collate "role" < "Role" < "rôle"
This worked for me. As you can see, ..._S2 ignores case, too.
Using a newer standard version, it should look like this:
CREATE DATABASE yourDB USING COLLATE CLDR181_S1
Collation keywords:
UCA400R1 = Unicode Standard 4.0 = CLDR version 1.2
UCA500R1 = Unicode Standard 5.0 = CLDR version 1.5.1
CLDR181 = Unicode Standard 5.2 = CLDR version 1.8.1
If your database is already created, there is supposed to be a way to change the setting.
CALL SYSPROC.ADMIN_CMD( 'UPDATE DB CFG USING DB_COLLNAME UCA500R1_S1 ' );
I do have problems executing this, but for all I know it is supposed to work.
Generated table column
Other options include, e.g., generating an upper-case column:
CREATE TABLE t (
id INTEGER NOT NULL PRIMARY KEY,
str VARCHAR(500),
ucase_str VARCHAR(500) GENERATED ALWAYS AS ( UPPER(str) )
)@
INSERT INTO t(id, str)
VALUES ( 1, 'Some String' )@
SELECT * FROM t@
ID STR UCASE_STR
----------- ------------------------------------ ------------------------------------
1 Some String SOME STRING
1 record(s) selected.
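The generated column can then back a case-insensitive lookup; a sketch (the index name is illustrative):

CREATE INDEX T_UCASE_STR ON t (ucase_str);

SELECT * FROM t WHERE ucase_str = UPPER('some string');

An ordinary index on ucase_str makes the case-insensitive match cheap, at the cost of storing the extra column.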