I have a table that generates integer values (currently at 10,230), and I have a list of accounts to which I want to assign some of these integer values, although they won't be used in prod yet.
I want to allocate the next 20 values (e.g. 10,231, 10,232, etc.), but I need the next person who inserts into the prod table not to take the values I've allocated to those accounts. How can I prevent those values from being assigned to someone else?
Basically, I want people to start generating values from 10,251 onward, since 10,231 - 10,250 are allocated by me.
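To illustrate the intent, a minimal sketch assuming the values come from a PostgreSQL sequence (the sequence name accounts_id_seq is hypothetical):

-- Hypothetical: advance the sequence past the reserved block, so the
-- next nextval() call returns 10251 and 10231-10250 stay free for
-- manual assignment.
SELECT setval('accounts_id_seq', 10250);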
I'm facing the challenge of generating 'balance' values from thousands of entries in a PG table.
The rows in the table have many different columns, each useful in calculating that row's contribution to the balance. Each row/entry belongs to some profile. I need to calculate the balance value for a given profile from all entries belonging to that profile, according to some set of rules. Complexity should be O(N), N being the number of entries that belong to the profile.
The different approaches I took:
Fetching the rows and calculating the balances on the backend. This doesn't scale well and degrades quickly with the number of entries that belong to the profile. While fetching the entries is initially fast, once a profile has over 10,000 entries it becomes prohibitively slow.
I figured that a lot of time was being spent on transport; additionally, we don't really need the rows, only the balances. Since we already do the work of finding the entries, we can calculate the balances in the database and save on backend calculations as well as the transport of thousands of rows. This led to the second approach:
The second approach was creating a PG query that iterates over the rows and calculates the balances. This has proven more scalable when there are many entries per profile. However, probably due to the complexity of the PG query, this approach puts a lot of load on the database: it's enough to run 3-4 of these queries concurrently to max out the database CPU.
The third approach is to create a PL/pgSQL function that loops over the relevant entries and returns the balances, hoping to reduce the impact on the database. This is the next thing I want to try.
The main question is: what would be the most efficient way to achieve this while being 'database friendly'?
Additionally:
Do you think these approaches are sane?
Am I missing another obvious solution?
Is it unlikely that I can improve on the query's performance with a function looping over the same rows, or is it worth trying?
I realize I haven't provided a lot of concrete data, but I figured that since this is probably a common problem, maybe the issue can be understood from a general description.
EDIT:
To be a bit more specific, I'm dealing with the following data:
CREATE TABLE entries (
profileid bigint NOT NULL,
programid bigint NOT NULL,
ledgerid text NOT NULL, -- this provides further granularity, on top of 'programid'
startdate timestamptz,
enddate timestamptz,
amount numeric NOT NULL
);
What I want to get are the balances for a certain profileid, broken down by (programid, ledgerid).
The desired form is:
RETURNS TABLE (
programid bigint,
ledgerid text,
currentbalance numeric,
pendingbalance numeric,
expiredbalance numeric,
spentbalance numeric
)
The four balance values are produced by applying arithmetic to certain entries. For example, a negative amount only adds to spentbalance; expiredbalance is generated from entries that have a positive amount and an enddate after now(); etc.
While I did manage to create a very large aggregate query with many calls to COALESCE(SUM(CASE WHEN ... amount), 0), I was wondering whether I'd benefit from porting that logic into a PL/pgSQL function. However, when trying to implement this function, I realized I don't know how to iterate over one result set and return another one with different columns and rows. Should I use a temp table for this? That seems like overkill, as this query is expected to execute tens of times every second...
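For illustration, a minimal sketch of one shape this could take: a single-pass aggregate using FILTER instead of CASE WHEN, wrapped in a PL/pgSQL function that uses RETURN QUERY, so no temp table is needed. The bucketing conditions here are placeholders, not the real business rules:

-- Sketch only: the FILTER conditions are illustrative placeholders;
-- spentbalance and expiredbalance follow the examples described above.
CREATE OR REPLACE FUNCTION profile_balances(p_profileid bigint)
RETURNS TABLE (
    programid      bigint,
    ledgerid       text,
    currentbalance numeric,
    pendingbalance numeric,
    expiredbalance numeric,
    spentbalance   numeric
)
LANGUAGE plpgsql STABLE AS
$$
BEGIN
    RETURN QUERY
    SELECT e.programid,
           e.ledgerid,
           COALESCE(SUM(e.amount) FILTER (WHERE e.amount > 0 AND e.startdate <= now()), 0),
           COALESCE(SUM(e.amount) FILTER (WHERE e.amount > 0 AND e.startdate > now()), 0),
           COALESCE(SUM(e.amount) FILTER (WHERE e.amount > 0 AND e.enddate > now()), 0),
           COALESCE(SUM(e.amount) FILTER (WHERE e.amount < 0), 0)
    FROM entries e
    WHERE e.profileid = p_profileid
    GROUP BY e.programid, e.ledgerid;
END;
$$;

A plain SQL function over the same query would work just as well; the point is that RETURN QUERY lets the function emit a result set shaped differently from the table it reads.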
We often have columns that can contain values of varying sizes. For these, I like to set the data type to VARCHAR with a size well beyond the current maximum length. For example, if I have a column where the current minimum length for a value is 10 and the maximum length is 35, I might set the data type to VARCHAR(64). My rationale is that Db2 stores the 2-byte length followed by the exact value; therefore, from a storage perspective, there is no difference between defining the data type as VARCHAR(64) and VARCHAR(35). And I don't get an error if a value with a length of 36 comes along.
Is there a nuance that I'm missing, and should I not be so glib about my VARCHAR assignments?
The exact formula for calculating row length is described in the docs for CREATE TABLE. VARCHAR(64) vs. VARCHAR(35) should not make a difference.
Be aware that rows are stored in data pages in tablespaces. Database systems usually pre-allocate pages for performance reasons. Moreover, pages might not be fully filled, or there may be compression. And you might have defined indexes, which require their own pages and structures. Plus there is metadata in the system catalog.
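As a small illustration of that point (the table and values here are made up): both columns below store the same 10-character value as a 2-byte length prefix plus 10 bytes of data, so the declared maximum only caps what may be inserted.

-- Made-up example: identical storage for the same value,
-- regardless of the declared VARCHAR maximum.
CREATE TABLE t (
    a VARCHAR(35),
    b VARCHAR(64)
);
INSERT INTO t VALUES ('abcdefghij', 'abcdefghij'); -- both columns: 2 + 10 bytes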
I have two ColdFusion templates (getdata.cfm and generate.cfm). The first template, getdata.cfm, retrieves data from a database with a query, in addition to other tasks. It retrieves exactly 16 rows of data, and each row has 8 fields. Such as this:
<cfquery datasource="xyz" name="lista">
SELECT n1,n2,n3,n4,n5,n6,n7,n8
FROM atable
WHERE product = 'abc'
ORDER BY date DESC LIMIT 16
</cfquery>
The second template will generate some random numbers and compare them against these 16 rows. There is a Refresh button on the second template to regenerate the numbers. This is how I would like it to work.
However, right now the only way I can get it to work is by having the database <cfquery> in the second template, generate.cfm. That means every time I press Refresh, it accesses the database, retrieves the same 16 rows, and generates the random numbers. This is not ideal: because the 16 rows are the same, it makes no sense to retrieve them every time a new set of random numbers is generated. It would be best to get them once, in the first template, and somehow pass them to the second template. The 16 sets of numbers need to be displayed on the screen at all times, and the matched and unmatched numbers need to be shown.
How can I pass the whole query result from the first template to the second one, without having to pass the 16 records as 16 lists in form fields? Is this even possible? Thanks in advance.
So the workflow is:
Generate a set of random numbers.
Compare those numbers against a static set of data.
Repeat.
Do you need to do the comparison in the application (CFML) code? Can you generate the set of random numbers and send them to the DB as part of the query in a single request? That way, you get the records from the DB that match your set of numbers and not all 16.
Then every refresh would send the new set of random numbers to the DB, returning only the relevant data.
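A rough sketch of that idea, adapted so that all 16 rows still come back (since both matched and unmatched numbers must be shown): bind the generated numbers as query parameters and let the database compute a match flag. The :r1..:r3 placeholders and the matching rule are assumptions; in CFML these would be cfqueryparam bindings.

-- Sketch: :r1..:r3 stand for the freshly generated random numbers.
SELECT n1, n2, n3, n4, n5, n6, n7, n8,
       CASE WHEN n1 IN (:r1, :r2, :r3)
              OR n2 IN (:r1, :r2, :r3) -- extend to n3..n8 as needed
            THEN 1 ELSE 0
       END AS matched
FROM atable
WHERE product = 'abc'
ORDER BY date DESC LIMIT 16;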
Alternatively, you can use cfquery with the cachedWithin attribute to store the results of the query in memory for a specific amount of time while refreshing your random set of numbers.
https://cfdocs.org/cfquery
I can do this using multiple commands in C# for the app I'm creating, but I'd prefer a stored proc to (hopefully) eliminate issues with latency/locks, etc.:
I have a table of 10 extensions (important fields):
SortOrder, Extension, IsUsed
The first record is set to IsUsed = true.
When calling the stored proc, I need the IsUsed of the NEXT record in sort order set to true, and the current record that is true set to false. When I hit the last record, rotate back to the first record.
Use Case: I need to rotate through a bank of usable numbers. Multiple people use the app, so a number cannot be reused within the last 4 minutes (a bank of 10 will suffice, but we can extend it if necessary). When a user requests a number, they get the next available one. I can build the table however needed, so any and all options to achieve the use case are welcome.
The flag needs to be set to true on the 1st record when the stored proc is called; all other records should be false.
I have seen this, which is of interest, but it doesn't quite answer my question:
Get "next" row from SQL Server database and flag it in single transaction
If all that you're using this for is to return a number to identify a session, I'd suggest scrapping the whole table idea and letting SQL Server do the work for you.
You can create a SEQUENCE object that will cycle and return the next value for you, without needing to write any code or maintain any tables.
CREATE SEQUENCE dbo.Extension
AS integer
START WITH 5
INCREMENT BY 5
MINVALUE 5
MAXVALUE 50
CYCLE;
This will return the number 5 the first time it's called, up to the number 50 on call number 10, and then start back over. You can adjust the numbers in the code to more or less do whatever you would like, though.
Get the next value like this:
SELECT NEXT VALUE FOR dbo.Extension;
And when/if you need to extend the range:
ALTER SEQUENCE dbo.Extension
MAXVALUE 100;
Play around with the idea on the Rextester demo.
Edit: In light of the comments above and below, I'd still stick with a SEQUENCE, I think.
Every time your code calls the table for an extension, use a query along the lines of this:
SELECT
Extension
FROM
ExtTable
WHERE
SortOrder = NEXT VALUE FOR dbo.Extension;
Functionally, this should do what you're after, again with no code to write or maintain.
This is the data that comes back from the database: a data sample for one season (the report returns values for two). What you can see is groupings by Season, Theater, then Performance number, and lastly the revenue and ticket columns.
The SSRS report has three levels of groupings: pkg (another ID that groups the rows below), venue (the venue column), and perf_desc (the description column linked to the perf_no).
What I need to do is take the revenue column (a unique value) for each Performance and return it in a separate column, so I use this formula:
sum(Max(Fields!perf_tix.Value, "perf_desc"))
This works great: it gives me the total unique value for each performance and sums them up at the pkg level.
The catch is when I need to pull the data out by season.
I created a separate column for this; it's yellow because it's invisible and is referenced elsewhere. The expression checks whether the Season value equals the passed season parameter and, if so, pulls the sum of each of the tix values and sums them up. This also works great on the lower line, the line where the grouping exists for pkg (light blue in my case):
=iif(Fields!season.Value = Parameters!season.Value, Sum(Max(Fields!perf_tix.Value, "perf_desc")), 0)
However, on the line above, the parent/header line, it gives me the sum of the two seasons' values, basically adding it all up. This is not what I want. And why is it doing this? The season value is not equal to the passed parameter for the second season, so why is that being added to the grouped value?
How do I fix this?
Since your aggregate function is inside your IIF function, only the first record in your dataset is being evaluated. If the first one matches the parameter, all records would be included.
This might work:
=Sum(Max(IIF(Fields!season.Value = Parameters!season.Value, Fields!perf_tix.Value, 0), "perf_desc"))
It might be better if your report also grouped on the Venue; otherwise, your count may include all values.