Best Way to Sequentially Parse Through a Table in T-SQL - tsql

I'm writing a stored procedure in SQL Server and hoping someone can suggest a more computationally efficient way to handle this problem:
I have a table of Customer Orders (i.e., "product demand") data that contains 3000 line items. Each record expresses the Order Qty for a specific product.
I also have another table of Production Orders (i.e., "product supply") data that contains about 200 line items. Each record expresses the Qty Available for each specific product.
The problem is that there is typically less supply than demand and, therefore, the Customer Orders table contains an Allocation Priority value that shows each Customer Order's position in line to receive product.
What's the best way to allocate Qty Available in Production Orders to the Order Qty in Customer Orders? Note that you can't allocate more to each Customer Order than has been ordered.
I can do this by creating a WHILE loop and doing the allocation product-by-product, line-by-line but it is very slow.
Is there a faster set-based way to approach this problem?

I don't have data to test against, so treat this as untested. It does not try to fill partial quantities: an order is returned only if it can be filled completely.
select orders.custID, orders.priority, orders.prodID, orders.qty, SUM(cumu.qty) as cumu_qty
from orders
join orders as cumu
on cumu.prodID = orders.prodID
and cumu.priority <= orders.priority
join available
on available.prodID = orders.prodID
group by orders.custID, orders.priority, orders.prodID, orders.qty, available.qty
having SUM(cumu.qty) <= available.qty
order by orders.custID, orders.priority, orders.prodID
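If partial fills are wanted, the same running-total idea can be extended. The following is only a sketch, run against SQLite through Python's sqlite3 module rather than SQL Server, so the syntax is not T-SQL. It keeps the table and column names from the query above and assumes (my assumptions, not the asker's schema) one `available` row per product and a unique priority per product. Each order receives whatever supply is left after the higher-priority orders, capped at its own quantity and floored at zero.

```python
import sqlite3


def allocate(orders, available):
    """Allocate supply to orders by priority, allowing partial fills.

    orders:    iterable of (custID, priority, prodID, qty)
    available: iterable of (prodID, qty), one row per product (assumption)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (custID, priority, prodID, qty)")
    conn.execute("CREATE TABLE available (prodID, qty)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", orders)
    conn.executemany("INSERT INTO available VALUES (?, ?)", available)
    rows = conn.execute(
        """
        SELECT o.custID, o.prodID, o.priority, o.qty,
               -- SUM(c.qty) - o.qty is the demand ahead of this order;
               -- two-argument MIN/MAX are scalar functions in SQLite
               MAX(0, MIN(o.qty, a.qty - (SUM(c.qty) - o.qty))) AS allocated
        FROM orders AS o
        JOIN orders AS c
          ON c.prodID = o.prodID AND c.priority <= o.priority
        JOIN available AS a
          ON a.prodID = o.prodID
        GROUP BY o.custID, o.prodID, o.priority, o.qty, a.qty
        ORDER BY o.prodID, o.priority
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    demo_orders = [("c1", 1, "P", 5), ("c2", 2, "P", 4), ("c3", 3, "P", 6)]
    for row in allocate(demo_orders, [("P", 7)]):
        print(row)
```

On SQL Server 2012 or later, a windowed `SUM(qty) OVER (PARTITION BY prodID ORDER BY priority)` should be able to replace the self-join for the running total, which tends to scale better than the triangular join shown here.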

Related

How to avoid customer's order history being changed in MongoDB?

I have two collections
Customers
Products
I have a field called "orders" in each customer document that stores references to the IDs of the products the customer ordered. Because I only reference the product ID, updating a product's "title" also changes what appears in the customer's order history. I can't embed the full information for each order, since a customer may order thousands of products and the document could quickly reach the 16 MB limit. What's the fix for this? Thanks.
Create an Orders Collection
Store ID of the user who made the order
Store ID of the product bought
I understand you are looking up the value of the product from the customer entity. You will always get the latest price if you are not storing the order/price historical transactions. Because your data model is designed this way to retrieve the latest price information.
My suggestion.
Orders placed with a product and price always need to be stored in a history entity, such as order lines, and no process should be allowed to change them. That way, when you look up the products a customer bought, you always get the historical price, and a later price change does not affect previous orders. Two options:
Store the order history in the current customers collection (or only the latest, say, 50 order lines if you don't need the full history; this needs additional logic).
If option 1 is not feasible because of the large number of orders, consider creating an order lines transaction collection and referring to the order line for the product bought via DBRef or the $lookup aggregation stage.
Note: it would have helped if you had given the current number of documents in each collection and their expected quarter-over-quarter growth.
You have orders and products. Orders are referencing products. Your problem is that the products get updated and now your orders reference the new product. The easiest way to combat this issue is to store full data in each order. Store all the key product-related information.
The advantage is that this kind of solution is extremely easy to visualize and implement. The disadvantage is that you have a lot of repetitive data since most of your products probably don't get updated.
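A minimal sketch of the "store full data in each order" idea, using plain Python dicts in place of MongoDB documents. The field names (`title`, `price`, `ordered_at`) and the `make_order` helper are hypothetical; in a real application the returned dict would be inserted into a separate orders collection (for example with pymongo's `insert_one`), which keeps customer documents well under the size limit.

```python
from datetime import datetime, timezone


def make_order(customer_id, product, quantity, now=None):
    """Build an order document that copies the product fields it depends on.

    The copy is taken when the order is placed, so later edits to the
    product document cannot change this order's history.
    """
    return {
        "customer_id": customer_id,
        "product_id": product["_id"],   # still kept for lookups
        "title": product["title"],      # snapshot, not a live reference
        "price": product["price"],      # snapshot, not a live reference
        "quantity": quantity,
        "ordered_at": now or datetime.now(timezone.utc),
    }


if __name__ == "__main__":
    product = {"_id": "p1", "title": "Blue mug", "price": 10}
    order = make_order("c1", product, 2)
    product["title"] = "Navy mug"       # product is renamed later
    print(order["title"])               # the order still shows the old title
```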
If you store a product update history based on timestamps, then you could solve your problem. Products are identified now by 3 fields. The product ID, active start date and active end date. Or you could configure products in this way: product ID = product ID + "Version X" and store this version against each order.
If you use dates, then you will query for the product and find the product version that was active during the time period in which the order occurred. If you use versions against the product, then you will simply query the database for the particular version of the product itself. I haven't used MongoDB, so I'm not sure exactly how you would achieve this there. Naively, however, you could modify the product ID to include the version, possibly using # as a delimiter.
The advantage of this solution is that you don't store much extra data. Considering that products won't be updated too often, I feel this is the ideal solution to your problem.
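The date-based variant can be sketched in plain Python as follows. The field names `active_from` and `active_to` and the `version_at` helper are assumptions for illustration; an open-ended current version is represented here by `active_to = None`. In MongoDB the same filter would be expressed as a range query on those two fields.

```python
from datetime import datetime


def version_at(versions, product_id, when):
    """Return the version of product_id that was active at `when`, or None.

    Each version is a dict with product_id, active_from and active_to;
    active_to is None for the version that is still current.
    """
    for version in versions:
        if (
            version["product_id"] == product_id
            and version["active_from"] <= when
            and (version["active_to"] is None or when < version["active_to"])
        ):
            return version
    return None


if __name__ == "__main__":
    versions = [
        {"product_id": "p1", "title": "Blue mug",
         "active_from": datetime(2023, 1, 1), "active_to": datetime(2024, 1, 1)},
        {"product_id": "p1", "title": "Navy mug",
         "active_from": datetime(2024, 1, 1), "active_to": None},
    ]
    order_time = datetime(2023, 6, 15)
    print(version_at(versions, "p1", order_time)["title"])
```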

@BatchFetch type JOIN

I'm confused about this annotation for an entity field that is of type of another entity:
@BatchFetch(value = BatchFetchType.JOIN)
In the docs of EclipseLink for BatchFetch they explain it as following:
For example, consider an object with an EMPLOYEE and PHONE table in
which PHONE has a foreign key to EMPLOYEE. By default, reading a list
of employees' addresses by default requires n queries, for each
employee's address. With batch fetching, you use one query for all the
addresses.
but I'm confused about the meaning of specifying BatchFetchType.JOIN. I mean, doesn't BatchFetch do a join in the moment it retrieves the list of records associated with employee? The records of address/phone type are retrieved using the foreign key, so it is a join itself, right?
The BatchFetch type is an optional parameter, and for join it is said:
JOIN – The original query's selection criteria is joined with the
batch query
What does this mean? Isn't the batch query a join itself?
Joining the relationship and returning the referenced data with the main data is a fetch join. So a query that brings in 1 Employee that has 5 phones results in 5 rows being returned, with the Employee data duplicated in each row. When that is less than ideal, say for a query over 1000 employees, you resort to a separate batch query for the phone numbers. Such a setup runs the main query once to return 1000 employee rows, and then runs a second query to return all the phones needed to build the employees that were read in.
The three batch query types listed here then determine how this second batch query gets built. These will perform differently based on the data and database tuning.
JOIN - Works much the same way a fetch join would, except that it only returns the Phone data.
EXISTS - This causes the DB to execute the initial query on Employees, but uses it in an EXISTS subquery to then fetch the Phones.
IN - EclipseLink aggregates all the Employee IDs or values used to reference Phones, and uses them to filter Phones directly.
Best way to find out is always to try it out with SQL logging turned on to see what it generates for your mapping and query. Since these are performance options, you should test them out and record the metrics to determine which works best for your application as its dataset grows.
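To make the difference concrete, here is a rough illustration of the three shapes the second (batch) query can take, run against SQLite from Python. This is not the SQL that EclipseLink actually generates; the `employee`/`phone` tables, their columns, and the `dept = 'eng'` selection criteria are invented for the example. All three return the same phones; they differ only in how the original query's criteria reach the batch query.

```python
import sqlite3

# Main query: its selection criteria (dept = 'eng') is what gets reused below.
MAIN = "SELECT id, name FROM employee WHERE dept = 'eng' ORDER BY id"

# JOIN: the original criteria is joined onto the phone query.
BATCH_JOIN = """
    SELECT p.id, p.emp_id, p.number FROM phone AS p
    JOIN employee AS e ON e.id = p.emp_id
    WHERE e.dept = 'eng' ORDER BY p.id"""

# EXISTS: the original criteria sits in a correlated subquery.
BATCH_EXISTS = """
    SELECT p.id, p.emp_id, p.number FROM phone AS p
    WHERE EXISTS (SELECT 1 FROM employee AS e
                  WHERE e.id = p.emp_id AND e.dept = 'eng')
    ORDER BY p.id"""


def batch_in(conn, employee_ids):
    """IN: the IDs already read by the main query are passed as parameters."""
    marks = ",".join("?" * len(employee_ids))
    sql = f"SELECT id, emp_id, number FROM phone WHERE emp_id IN ({marks}) ORDER BY id"
    return conn.execute(sql, employee_ids).fetchall()


def fetch_phones_three_ways():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
        CREATE TABLE phone (id INTEGER PRIMARY KEY, emp_id INTEGER, number TEXT);
        INSERT INTO employee VALUES (1, 'Ann', 'eng'), (2, 'Bob', 'eng'), (3, 'Cy', 'ops');
        INSERT INTO phone VALUES (10, 1, '555-0001'), (11, 1, '555-0002'),
                                 (12, 2, '555-0003'), (13, 3, '555-0004');
    """)
    employee_ids = [row[0] for row in conn.execute(MAIN)]
    results = (
        conn.execute(BATCH_JOIN).fetchall(),
        conn.execute(BATCH_EXISTS).fetchall(),
        batch_in(conn, employee_ids),
    )
    conn.close()
    return results


if __name__ == "__main__":
    for phones in fetch_phones_three_ways():
        print(phones)
```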

PostgreSQL: Returning ordered rows after a specific ID

Scenario:
I am displaying a table of records. It initially displays the first 500 with "show more" at the bottom, which returns the next 500.
Issue:
If between initial display and clicking "show more" 1 record is added, that will cause "order by date, offset 500, limit 500" to overlap by 1 row.
I'd like to "order by date, offset until 'id of last row shown', limit 500"
My row IDs are UUIDs. I am open to alternative approaches that achieve the same result.
If you can order by ID, you can paginate using
where id > $last_seen_id limit 500
but that's not going to be useful where you're sorting by date.
Sort stability!
I really hope that "date" actually means "timestamp" though, otherwise your ordering will be unstable and you can miss rows in pagination; you'll have to order by date, id to get stable ordering if it's really a date, and should probably do so even for timestamp.
State on client
One option is to push the state out to the client. Have the client remember the last-seen (date,id) tuple, and compare it as a row value so that ties on date are broken by id:
where (date, id) > ($last_seen_date, $last_seen_id) order by date, id limit 500
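A small sketch of this client-held state, run against SQLite from Python so it is self-contained. The `records` table and its `created` column are made up for the example. The tuple comparison is written out in its expanded form, `created > ? OR (created = ? AND id > ?)`, which compares the (date, id) pair lexicographically; comparing the two columns independently with AND would skip rows that have a later date but a smaller id.

```python
import sqlite3


def fetch_page(conn, last_seen=None, page_size=2):
    """Return the next page after last_seen, a (created, id) tuple or None."""
    if last_seen is None:
        sql = "SELECT created, id FROM records ORDER BY created, id LIMIT ?"
        params = (page_size,)
    else:
        sql = (
            "SELECT created, id FROM records "
            "WHERE created > ? OR (created = ? AND id > ?) "
            "ORDER BY created, id LIMIT ?"
        )
        params = (last_seen[0], last_seen[0], last_seen[1], page_size)
    return conn.execute(sql, params).fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id TEXT PRIMARY KEY, created TEXT)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [("a", "2024-01-01"), ("b", "2024-01-01"),
         ("c", "2024-01-02"), ("d", "2024-01-03")],
    )
    first = fetch_page(conn)
    # A row that sorts before everything shown so far arrives between requests.
    conn.execute("INSERT INTO records VALUES (?, ?)", ("e", "2023-12-31"))
    second = fetch_page(conn, last_seen=first[-1])
    conn.close()
    return first, second


if __name__ == "__main__":
    print(demo())
```

With OFFSET-based paging the late insert would have shifted the second page by one row and repeated a record; here the second page starts strictly after the last row the client saw.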
Cursors
Do you care about scalability? If not, you can use a server-side cursor. Declare the cursor for the full query, without the LIMIT. Then FETCH chunks of rows as requested. To do this your app must have a way to consistently bind a connection to a specific user's requests, though, and not to reset that connection or return it to the pool between requests. This might not be practical with your pool/framework, but is probably the best solution if you can do it.
Temp tables
Another even less scalable option is to CREATE TABLE sessiondata.myuser_myrequest_blah AS SELECT .... then paginate that table. It's guaranteed not to change. This avoids the difficulty of needing to keep a consistent connection across requests, but will have a very slow first-request response time and is completely impractical for large user counts or large amounts of data.
Related questions
Handling paging with changing sort orders
Using "Cursors" for paging in PostgreSQL
How to provide an API client with 1,000,000 database results?
I think you can use a subquery in the WHERE clause to accomplish this.
e.g. given you're paginating through a users table, and you want the records after a given user:
SELECT *
FROM users
WHERE created_at > (
SELECT created_at
FROM users
WHERE users.id = '00000000-1111-2222-3333-444444444444'
LIMIT 1
)
ORDER BY created_at ASC LIMIT 5;

TSQL - Deleting with Inner Joins and multiple conditions

My question is a variation on one already asked and answered (TSQL Delete Using Inner Joins) but I have a different level of complexity and I couldn't see a solution to it.
My requirement is to delete Special Prices which haven't been accessed in 90 days. Special Prices are keyed on Customer ID and Product ID, and the products have to be matched to a Customer Order Detail table which also contains a Customer ID and a Product ID. I want to write one function that will look at the Special Price table for each Customer, compare each Product for that Customer with the Customer Order Detail table and, if the Maximum Order Date is more than 90 days earlier than today, delete it from the Special Price table.
I know I can use a CURSOR (slow but effective) but would prefer to have a single query like the one in the TSQL Delete Using Inner Joins example. Any ideas and/or is more information required?
I can't dig any further into your system from here, but if it suits you, take a look at the MERGE statement; it may help you avoid cursors.
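For comparison, the requirement can also be sketched as a single DELETE with a correlated subquery, which is a different technique from MERGE. The version below runs against SQLite from Python; the table and column names (`special_price`, `order_detail`, `order_date`) are guesses at the asker's schema. Prices that have never been ordered are kept, because their maximum order date is NULL and the comparison is then not true; whether that is the desired behaviour is something only the asker can decide.

```python
import sqlite3


def purge_stale_prices(conn, cutoff_date):
    """Delete special prices whose most recent order is older than cutoff_date.

    cutoff_date is an ISO 'YYYY-MM-DD' string, so text comparison matches
    date order. Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM special_price
        WHERE (SELECT MAX(d.order_date)
               FROM order_detail AS d
               WHERE d.customer_id = special_price.customer_id
                 AND d.product_id = special_price.product_id) < ?
        """,
        (cutoff_date,),
    )
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE special_price (customer_id INTEGER, product_id INTEGER, price REAL);
        CREATE TABLE order_detail (customer_id INTEGER, product_id INTEGER, order_date TEXT);
        INSERT INTO special_price VALUES (1, 10, 9.5), (1, 11, 4.0), (2, 10, 8.0);
        INSERT INTO order_detail VALUES
            (1, 10, '2024-01-05'), (1, 10, '2024-05-20'),
            (1, 11, '2024-01-05');
    """)
    deleted = purge_stale_prices(conn, "2024-03-01")
    remaining = conn.execute(
        "SELECT customer_id, product_id FROM special_price ORDER BY 1, 2"
    ).fetchall()
    conn.close()
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

In T-SQL the cutoff would normally be computed in the statement itself, for example with `DATEADD(day, -90, GETDATE())`, instead of being passed in as a string.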

Pagination logic in Mainframe CICS

Here is my requirement.
Front (Client) end will do a search based on predefined conditions (for instance: customer id, account number, first name, last name, etc). I need to get the data corresponding to this request from a db2 database and send it back to them (Server). We use CICS channels and containers to pass requests and responses between the Client and Server.
Front end needs the data ordered by: Receive date descending, Customer id ascending, Account number ascending. Data are fetched in pages of 500 records. For example, if a search request from the front end would retrieve 50,000 records from the db2 database, we need to return this data in 500-record "pages". For pagination, we use the security deposit number field, which is the primary key of our table, but the sort order is not based on this field.
I would like to know whether we can use scrollable cursor logic in CICS to implement pagination.
Please note that I would prefer not to sort the data with a bubble sort on an internal array before sending the response, as that would degrade performance. I would like to do it via query logic. Any thoughts?
Example (Initial Front end input request):
Customer id : A
First time request (To identify whether it is first time or next or previous request for pagination)
First security deposit number : 0
Last security deposit number : 0
Since this is a first-time request, both of these fields will be zero from the front end, and we need to retrieve records from the database based on the condition security deposit number > 0
Db2 database:
There are 700 records for this criteria
Mainframe response for first time:
We will send the first 500 records
Front end will then send request for getting next set of records which will contain:
Customer id: A
Next request
First security deposit number: 0
Last security deposit number : 17980
So for this request, if I query my database based on security deposit number > 17980, it may list duplicate records on the screen, since our sort order in the database is not based on security deposit number.
How can I implement this logic?
Many Client/Server applications in an IBM Mainframe environment involve pseudo-conversational CICS transactions.
If you are using CICS in pseudo-conversational mode it
is not possible for the Server to hold cursors when it RETURNs to the Client. Therefore scrollable cursors
are of little use in this environment. So to answer your basic question: No, scrollable cursors cannot be used here.
The "trick" here is to create an SQL predicate in the Server that is restartable. It will then pick up rows in the correct order from any given
starting point. When the Client calls your Server it must pass all of the positioning information to your Server.
Typically, on a first call from a Client all of the positioning values are set to cause the cursor to
position itself starting with what must be the first row. The Server then pulls in a "page" worth of data
and returns it to the Client. On the next page forward request the Client sets these positioning values to
the last row it displayed and calls the Server for the next "page" of data.
In your situation I would assume that the page forward cursor would look something like this, all the
variables prefixed with RESTART... are what the Client must provide to the Server to start the cursor
in the correct position.
DECLARE Page-forward CURSOR FOR
SELECT Receive_Date, Customer_id, Account_Nbr, Security_Dep_Id
FROM Table_Name
WHERE ( (Receive_Date < :RESTART-RCV-DT)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id > :RESTART-CUSTOMER-ID)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id = :RESTART-CUSTOMER-ID AND
Account_Nbr > :RESTART-ACCT-NBR)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id = :RESTART-CUSTOMER-ID AND
Account_Nbr = :RESTART-ACCT-NBR AND
Security_Dep_Id > :RESTART-SEC-DEP-ID))
ORDER BY 1 DESC, 2 ASC , 3 ASC, 4 ASC
For the initial call the Client would have passed something like '9999-12-31' as the RESTART-RCV-DT, zero
for the RESTART-CUSTOMER-ID, RESTART-ACCT-NBR and RESTART-SEC-DEP-ID (assuming these are all numeric). If you look at
the cursor predicate carefully you can verify that there cannot be any rows prior to these values - therefore this
will return the first page of data. If the Client needs to page forward after this, it must tell the Server to start
with the next row after the last one it received. To do this it would populate the RESTART... variables with
the values from the last row on the page it just
displayed. This process will drive the cursor selects forward one page at a time.
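The forward-paging loop can be tried out in miniature. The sketch below uses SQLite from Python instead of DB2 under CICS, so `LIMIT` stands in for fetching one page worth of rows and a Python dict stands in for the RESTART host variables. The `deposits` table and its sample rows are invented; the predicate and sort order are the ones from the cursor above.

```python
import sqlite3

PAGE_FORWARD = """
    SELECT receive_date, customer_id, account_nbr, security_dep_id
    FROM deposits
    WHERE receive_date < :rcv_dt
       OR (receive_date = :rcv_dt AND customer_id > :cust)
       OR (receive_date = :rcv_dt AND customer_id = :cust AND account_nbr > :acct)
       OR (receive_date = :rcv_dt AND customer_id = :cust AND account_nbr = :acct
           AND security_dep_id > :dep)
    ORDER BY receive_date DESC, customer_id ASC, account_nbr ASC, security_dep_id ASC
    LIMIT :page_size
"""

# Positioning values for a first call: nothing can sort before them.
FIRST_CALL = {"rcv_dt": "9999-12-31", "cust": 0, "acct": 0, "dep": 0}


def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE deposits (receive_date TEXT, customer_id INTEGER, "
        "account_nbr INTEGER, security_dep_id INTEGER PRIMARY KEY)"
    )
    conn.executemany(
        "INSERT INTO deposits VALUES (?, ?, ?, ?)",
        [("2024-03-02", 1, 100, 5), ("2024-03-02", 1, 100, 2),
         ("2024-03-02", 2, 50, 9), ("2024-03-01", 1, 100, 7),
         ("2024-03-01", 1, 200, 1)],
    )
    return conn


def page_forward(conn, restart, page_size):
    """One stateless Server call: return the page that follows `restart`."""
    return conn.execute(PAGE_FORWARD, dict(restart, page_size=page_size)).fetchall()


def restart_from(row):
    """What the Client sends back: the key of the last row it displayed."""
    return {"rcv_dt": row[0], "cust": row[1], "acct": row[2], "dep": row[3]}


def read_all_pages(conn, page_size):
    pages, restart = [], FIRST_CALL
    while True:
        page = page_forward(conn, restart, page_size)
        if not page:
            return pages
        pages.append(page)
        restart = restart_from(page[-1])


if __name__ == "__main__":
    for page in read_all_pages(make_demo_db(), page_size=2):
        print(page)
```

No cursor or connection state has to survive between calls; everything needed to resume is in the restart values the Client passes back.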
When paging up, the process is reversed (you will need a second cursor to support this, and the Client needs to tell you which direction to page: Forward or Back). The Client
will need to populate the RESTART variables with the first row it received from the Server. The trick
for the Server on a page up request is to return the data
to the Client in the original display order. Because the backward cursor retrieves rows in reverse, you may have to populate the data page passed back
to the Client in reverse order (i.e. put the first row retrieved into the last row of the paging area shared between
the Client and the Server). The page backward cursor would look something like:
DECLARE Page-backward CURSOR FOR
SELECT Receive_Date, Customer_id, Account_Nbr, Security_Dep_Id
FROM Table_Name
WHERE ( (Receive_Date > :RESTART-RCV-DT)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id < :RESTART-CUSTOMER-ID)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id = :RESTART-CUSTOMER-ID AND
Account_Nbr < :RESTART-ACCT-NBR)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id = :RESTART-CUSTOMER-ID AND
Account_Nbr = :RESTART-ACCT-NBR AND
Security_Dep_Id < :RESTART-SEC-DEP-ID))
ORDER BY 1 ASC, 2 DESC , 3 DESC, 4 DESC
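A matching sketch for paging backward, again with SQLite and Python standing in for DB2 and the CICS program, and with an invented `deposits` table. The cursor walks away from the first displayed row in the reversed sort order, and the page is then flipped so the Client sees it in normal display order.

```python
import sqlite3

PAGE_BACKWARD = """
    SELECT receive_date, customer_id, account_nbr, security_dep_id
    FROM deposits
    WHERE receive_date > :rcv_dt
       OR (receive_date = :rcv_dt AND customer_id < :cust)
       OR (receive_date = :rcv_dt AND customer_id = :cust AND account_nbr < :acct)
       OR (receive_date = :rcv_dt AND customer_id = :cust AND account_nbr = :acct
           AND security_dep_id < :dep)
    ORDER BY receive_date ASC, customer_id DESC, account_nbr DESC, security_dep_id DESC
    LIMIT :page_size
"""


def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE deposits (receive_date TEXT, customer_id INTEGER, "
        "account_nbr INTEGER, security_dep_id INTEGER PRIMARY KEY)"
    )
    conn.executemany(
        "INSERT INTO deposits VALUES (?, ?, ?, ?)",
        [("2024-03-02", 1, 100, 5), ("2024-03-02", 1, 100, 2),
         ("2024-03-02", 2, 50, 9), ("2024-03-01", 1, 100, 7),
         ("2024-03-01", 1, 200, 1)],
    )
    return conn


def page_backward(conn, first_row_on_screen, page_size):
    """Return the page that precedes first_row_on_screen, in display order."""
    rcv_dt, cust, acct, dep = first_row_on_screen
    rows = conn.execute(
        PAGE_BACKWARD,
        {"rcv_dt": rcv_dt, "cust": cust, "acct": acct, "dep": dep,
         "page_size": page_size},
    ).fetchall()
    # Rows arrive nearest-first (reverse display order), so flip the page.
    rows.reverse()
    return rows


if __name__ == "__main__":
    conn = make_demo_db()
    print(page_backward(conn, ("2024-03-01", 1, 100, 7), page_size=2))
```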
As has been pointed out in other answers, this type of paging process does not manage or detect concurrent
updates to the database that may occur during paging transactions. That is another topic for another day...
Developing Restartable Cursors
The key to building a paging Server is to develop a cursor that is restartable from a set of values received
from a Client transaction. This leaves control of cursor positioning and direction with the Client.
It also means the Client must receive all critical positioning data from the Server even though
the Client might not actually
use these data for any other purpose (e.g. From your question I got the impression that the Client may not require
the Security Deposit Id except to supply as a positioning parameter for your Server)
To build a paging Server you need to know
what the required sorting order of the data is (e.g. Receive Date Descending then Customer Id Ascending then
Account Number Ascending).
You also need to know the set of data that uniquely identifies a row
returned by the cursor. In your case that would be the Security Deposit Id (this is the primary key for the
table you are selecting from so it must be unique for each and every row in that table). Knowing this you then build a
cursor predicate (the stuff in the WHERE clause) that will return data needed by the Client in the required sort order that
also includes
the full positioning key (i.e. Security Deposit Id). Two or more returned rows may contain identical data in
every other sort column, which is why the positioning key must be included as the final sort condition.
It doesn't matter if it is ascending or descending, but it needs to be included in the sort to ensure a consistent
order of data retrieval.
A fairly simple formula may be followed to build the predicate for a restartable cursor needed to
support paging Servers. Basically this is a cascade of "OR" clauses connecting a series of "AND" clauses
that become progressively more selective following the sort order required by the Client and end up with the positioning
key.
To see how this works consider how the query for your Server might be developed...
Start with the column from the sort order that changes least often...
SELECT ...
FROM ...
WHERE Receive_Date < restart value
This will retrieve all rows prior to the specified restart Receive Date regardless of what the other
column restart values are (e.g. Customer IDs can range from minimum to maximum values, as long as the Receive Date
is less than any Receive Date "seen" so far). Since this column only changes value after all subordinate sort column values
have been exhausted, you can be sure that this does not pick up any rows prior to the full restart key.
But what about those rows that occur on the same date as the restart request but have a
larger Customer Id? These can be picked up with....
SELECT ...
FROM ...
WHERE Receive_Date = restart value AND
Customer_id > restart value
What about those where the Receive Date and Customer Id are the same as the restart key but have
a larger Account Number? These can be picked up with...
SELECT ...
FROM ...
WHERE Receive_Date = restart value AND
Customer_Id = restart value AND
Account_Nbr > restart value
Continue this pattern until the full restart key has been processed. Notice that the inequality
signs are determined by the sort order. Use < when the column is sorted Descending and > when Ascending.
Also notice that the SELECT and FROM clauses
are exactly the same for each query - which means you can put them all together using OR conjunctions...
SELECT Receive_Date, Customer_id, Account_Nbr, Security_Dep_Id
FROM Table_Name
WHERE ( (Receive_Date < :RESTART-RCV-DT)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id > :RESTART-CUSTOMER-ID)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id = :RESTART-CUSTOMER-ID AND
Account_Nbr > :RESTART-ACCT-NBR)
OR (Receive_Date = :RESTART-RCV-DT AND
Customer_Id = :RESTART-CUSTOMER-ID AND
Account_Nbr = :RESTART-ACCT-NBR AND
Security_Dep_Id > :RESTART-SEC-DEP-ID))
ORDER BY 1 DESC, 2 ASC , 3 ASC, 4 ASC
There you go... a restartable cursor for forward paging. Construction of the cursor for backward paging follows a similar pattern, just flip the
sort orders and repeat.
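Because the formula is mechanical, the WHERE clause can be generated from the sort specification. Here is a small Python sketch of that idea; the `build_restart_predicate` helper and its `(column, direction, host_variable)` input format are made up for illustration, and in a real COBOL/DB2 program the predicate would simply be written out by hand as shown above.

```python
def build_restart_predicate(sort_spec):
    """Build the OR-of-ANDs restart predicate for a paging cursor.

    sort_spec lists (column, direction, host_variable) in sort order,
    ending with the unique positioning key. direction is 'ASC' or 'DESC'.
    """
    branches = []
    for i, (column, direction, host_var) in enumerate(sort_spec):
        # '>' continues an ascending column, '<' a descending one.
        op = ">" if direction.upper() == "ASC" else "<"
        # All earlier sort columns must equal their restart values.
        equalities = [f"{col} = {var}" for col, _, var in sort_spec[:i]]
        terms = equalities + [f"{column} {op} {host_var}"]
        branches.append("(" + " AND ".join(terms) + ")")
    return "\n   OR ".join(branches)


def reverse_spec(sort_spec):
    """Flip every direction to get the spec for the page-backward cursor."""
    flip = {"ASC": "DESC", "DESC": "ASC"}
    return [(col, flip[direction.upper()], var) for col, direction, var in sort_spec]


if __name__ == "__main__":
    forward = [
        ("Receive_Date", "DESC", ":RESTART-RCV-DT"),
        ("Customer_Id", "ASC", ":RESTART-CUSTOMER-ID"),
        ("Account_Nbr", "ASC", ":RESTART-ACCT-NBR"),
        ("Security_Dep_Id", "ASC", ":RESTART-SEC-DEP-ID"),
    ]
    print("WHERE " + build_restart_predicate(forward))
    print()
    print("WHERE " + build_restart_predicate(reverse_spec(forward)))
```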
A simplistic approach: Write your SQL to retrieve data according to your criteria, in the sort order you specify. Then only retrieve the keys to the rows you want. Save the keys somewhere you will have access to upon subsequent invocations of your transaction. Look into multi-row select in DB2. Also understand pseudo-conversational programming techniques in CICS.
And now we get to the design implications Bill Woodger mentions, that you do not specify in your question, and which are the reason I'm just hitting the high points of a simplistic approach.
If changes to your result set occur between one invocation and the next, your results will not reflect those changes. You must decide if this is important.
You mention a "front end" but do not specify what it is. If it is a BMS application, you may be able to save the keys in your commarea or in a container. If your front end is a distributed application invoking your transactions via CICS Web Services or CICS Web Support or MQ or raw sockets or whatever, you must design a mechanism to store those keys such that you can uniquely retrieve them — perhaps by sending a contrived key back to the distributed application which it must supply upon subsequent invocations. Then you must have some process to clean up your key store.
Creating a solution to your problem that is unique in your IT shop is not something to be done in isolation. You must involve others who will be tasked with maintaining your application, there may be a group external to your project tasked with making such decisions, there may be infrastructure issues with your solution.
So this isn't so much an answer to your question as an elaboration upon why you may not get an answer, or at least the answer you seem to desire.