Skip some ranges in a PostgreSQL sequence?

I want to skip some ranges in a sequence:
CREATE SEQUENCE id_seq;
Consider a sequence id_seq that starts at 100. When it reaches 199 it should continue from 1000, and when it reaches 1999 it should continue from 10000:
SELECT setval('id_seq', 100);
Does Postgres have any built-in configuration to do this?
Multiple processes will use this sequence, so assigning values manually with setval() leads to difficulties.

No, there is nothing built in to do this. I've never heard of anyone wanting to do this before.
If you really care about the numbers you get, then a sequence isn't the right thing for you anyway; you can get gaps in it quite easily. It's designed to generate distinct numbers without impacting concurrency.
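That said, if you can live with calling a wrapper function instead of nextval() directly, one workaround is a small PL/pgSQL function that re-seeds the sequence whenever it wanders into a range you want to skip. This is only a sketch under stated assumptions: the function name next_skipping_id, the advisory-lock key, and the hard-coded range boundaries are all made up, and the advisory lock is there because the check-and-reseed would otherwise race between concurrent callers.
CREATE SEQUENCE id_seq START 100;

CREATE OR REPLACE FUNCTION next_skipping_id() RETURNS bigint AS $$
DECLARE
    v bigint;
BEGIN
    -- Serialize callers so the check-and-reseed below cannot race.
    PERFORM pg_advisory_xact_lock(4711);  -- arbitrary lock key for this sequence
    v := nextval('id_seq');
    IF v BETWEEN 200 AND 999 THEN
        PERFORM setval('id_seq', 1000, false);  -- next nextval() returns 1000
        v := nextval('id_seq');
    ELSIF v BETWEEN 2000 AND 9999 THEN
        PERFORM setval('id_seq', 10000, false);
        v := nextval('id_seq');
    END IF;
    RETURN v;
END;
$$ LANGUAGE plpgsql;
Every process then calls SELECT next_skipping_id(); instead of nextval(), which keeps the skipping logic in one place at the cost of a brief lock on every call.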

Related

PostgreSQL Serialized Inserts Interleaving Sequence Numbers

I have multiple processes inserting into a Postgres (10.3) table using the SERIALIZABLE isolation level.
Another part of our system needs to read these records and be guaranteed that it receives all of them in sequence. For example, in the picture below, the consumer would need to run
select * from table where sequenceNum > 2309 order by sequenceNum limit 5
and then receive sequence numbers 2310, 2311, 2312, 2313 and 2314.
The reading query uses the READ COMMITTED isolation level.
What I'm seeing, though, is that the reading query only receives the rows I've highlighted in yellow. Looking at the xmin values, I'm guessing that transaction 334250 had begun but not finished, and then transactions 334251, 334252 et al. started and finished before my reading query started.
My question is: how did they get sequence numbers interleaved with those of 334250? Why weren't those transactions blocked, by virtue of all of the writing transactions being SERIALIZABLE?
Any suggestions on how to achieve what I'm after, which is a guarantee that different transactions don't generate interleaving sequence numbers? (It's OK if there are gaps, but they can't interleave.)
Thanks very much for your help. I'm losing hair over this one!
PS - I just noticed that 334250 has a non-zero xmax. Is that perhaps the clue I'm missing?
The SQL standard in its usual brevity defines SERIALIZABLE as:
The execution of concurrent SQL-transactions at isolation level SERIALIZABLE is guaranteed to be serializable.
A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions
that produces the same effect as some serial execution of those same SQL-transactions. A serial execution
is one in which each SQL-transaction executes to completion before the next SQL-transaction begins.
In the light of this definition, I understand that your wish is that the sequence numbers be in the same order as the “serial execution” that “produces the same effect”.
Unfortunately the equivalent serial ordering is not clear at the time the transactions begin, because statements later in the transaction can determine the “logical” order of the transactions.
Sequence numbers on the other hand are ordered according to the wall time when the number was requested.
In a way, you would need sequence numbers that are determined by something that is not certain until the transactions commit, and that is a contradiction in terms.
So I think that it is not possible to get what you want, unless you actually serialize the execution, e.g. by locking the table in SHARE ROW EXCLUSIVE mode before you insert the data.
My question is why you have that unusual demand. I cannot think of a good reason.
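For completeness, the locking option mentioned above might look like the following sketch (table, column, and sequence names are made up). SHARE ROW EXCLUSIVE conflicts with itself, so writers queue up one at a time and their sequence numbers cannot interleave, at the cost of all insert concurrency:
BEGIN;
-- Only one writer at a time gets past this lock; the rest queue behind it.
LOCK TABLE mytable IN SHARE ROW EXCLUSIVE MODE;
INSERT INTO mytable (sequenceNum, payload)
    VALUES (nextval('mytable_seq'), 'some payload');
COMMIT;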

Postgres: incrementing a counter field

How does one increment a field in a database such that even if a thousand connections try to increment it at once, it always ends up at 1000 (if it started from zero)?
I mean, how do you ensure that no two connections increment from the same number, resulting in a lost increment?
How do you synchronize and make sure the data is consistent? Is a stored procedure required for this? Database "locking"? How is that done?
What you're looking for is a Postgres SEQUENCE.
You call nextval('sequence_name') to get the next number in the sequence.
According to the docs,
sequences are designed with concurrency in mind:
To avoid blocking concurrent transactions that obtain numbers from the same sequence, a nextval operation is never rolled back; that is, once a value has been fetched it is considered used, even if the transaction that did the nextval later aborts. This means that aborted transactions might leave unused "holes" in the sequence of assigned values.
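A minimal sketch (the sequence name is made up):
CREATE SEQUENCE hits_seq;

-- Each call returns a distinct value even under heavy concurrency,
-- so 1000 connections doing one call each end at exactly 1000.
SELECT nextval('hits_seq');  -- 1
SELECT nextval('hits_seq');  -- 2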
EDIT:
If you're looking for a gapless sequence, someone posted a solution to the PostgreSQL mailing list back in 2006: http://www.postgresql.org/message-id/44E376F6.7010802@seaworthysys.com. There is also a lengthy discussion there on locking, etc.
The gapless-sequence question was also asked on SO even though there was never an accepted answer:
postgresql generate sequence with no gap
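The workaround those discussions converge on is a single-row counter table updated in the same transaction that consumes the number. A sketch with made-up names, not a drop-in recipe:
CREATE TABLE invoice_counter (last_number bigint NOT NULL);
INSERT INTO invoice_counter VALUES (0);

-- Run inside the transaction that uses the number. The row lock
-- serializes all callers, and the increment rolls back together
-- with the transaction, so no gaps can appear.
UPDATE invoice_counter
   SET last_number = last_number + 1
RETURNING last_number;
The price is that all writers are serialized on that one row, which is exactly the blocking a plain sequence is designed to avoid.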

Simulating an Oracle sequence with MongoDB

Our domain model deals with sales invoices, each of which has a unique, automatically generated number. When creating an invoice, our SalesInvoiceService retrieves a number from a SalesInvoiceNumberGenerator, creates a SalesInvoice using this number and a few other objects (seller, buyer, issue date, etc.) and stores it through the SalesInvoiceRepository. Since we are using MongoDB as our database, our MongoDbSalesInvoiceNumberGenerator uses a findAndModify command with $inc: 1 on a given InvoicePolicies.nextSalesInvoiceNumber to generate this unique number, similar to what we would do with an Oracle sequence.
This works in normal situations. However, when invoice creation fails because of a broken business rule (e.g. an invalid issue date), an exception is thrown and our InvoicePolicies.nextSalesInvoiceNumber has already been incremented. Obviously, since there is no transaction managing this unit of work, the increment is not rolled back, so we end up with lost invoice numbers. We do offer a manual compensation mechanism to the user, but we would like to avoid this sort of situation in the first place.
How would you deal with this situation? And no, switching to another database is not an option :)
Thanks!
TL;DR: What you want is strict serializability, but you probably won't get it unless you give up concurrency completely (then you even get linearizability, theoretically). Gap-free is easy, but making sure that today's invoice doesn't get a lower number than yesterday's is practically impossible.
This is tricky, or at least, very expensive. That is also true for any other data store, because you'll have to limit the concurrency of the application to guarantee it. Think of an auto-increasing stamp that is passed around in an office, but some office workers lose letters. Tricky... But you can reduce the likelihood.
Generating sequences without gaps is hard when contention is high, and very hard in a distributed system. Keeping a lock for the entire time the invoice is being generated is usually not an option, though it would make things easy. So let's try that first:
Easiest way out: Use a singleton background worker, i.e. a single-threaded process that runs on a single machine. Have it explicitly check whether the current number is really present in the invoice collection. Because it's single-threaded on a single machine, it can't have race conditions. Done, via limiting concurrency.
When allowing concurrency, things get messy:
It might be best to use something like a two-phase commit protocol. Essentially, make the entire invoice creation process a long-running transaction, and store the pending transactions explicitly, i.e. record all numbers that have been reserved but not yet used.
Then track the completion status of each and every transaction. If a transaction hasn't finished after some timeout, consider its number available again. It's hard enough to add that to the counter code, but it's possible (check whether a timed-out transaction is present; otherwise get a new counter value).
There are several possible errors, but they can all be resolved; this is better explained in write-ups of two-phase commit on the net. Generally, though, getting the implementation right is hard.
The timeout poses a problem, however, because you need to hard-code an assumption about how long invoice generation takes. That can be awkward close to day/month/year boundaries, since you'll want to avoid creating invoice 12344 in 2015 when invoice 12345 was already created in 2014.
Even this won't guarantee gap-free numbers within limited time intervals: if no further request arrives that could fill a gap number before the current year ends, you're still facing a problem.
I wonder whether combining findAndModify with the newer multi-document Transactions API could achieve this: if the $inc runs within a transaction, it should roll back with it, which would also account for gaps. I haven't tried it personally, and my project isn't far enough along to worry about the billing system yet, but I would love to use the same database for everything to make things a bit easier to operate.
One problem, I would think, is a write bottleneck on the counter document, but each increment should only take a few milliseconds, and you could use a different counter for every jurisdiction or store, like real-life stores do. The cash register number could be part of the number too; the digital-world equivalent of a cash register could be the transaction-processing server the request went to (say, if you used microservices and load-balanced round-robin between them). That assumes MongoDB uses a per-document lock, which from my understanding it possibly does.
The main cases where I'd worry about this bottleneck are a very popular store, a huge spike around Black Friday, or batch generation of recurring invoices.

sequence in DB2 generates duplicate values

My application uses a DB2 database. I created a sequence for my table to generate the primary key. It was working fine until today, but now it seems to be generating existing values, and I am getting a DuplicateKeyException while inserting. After a bit of googling I found that we can reset the sequence.
Could someone please help me with the best possible option, as I have not worked with sequences much, and with the things to consider while going with that approach?
If I have to reset the sequence, what is the best way to do it, and what are the points to consider before doing so? Also, it would be great to know what could be the reason behind the issue I am facing, so that I can take care of it in the future.
Just for information, the max value assigned while creating the sequence has not yet been reached.
Thanks a lot in advance.
ALTER SEQUENCE SCHEMA.SEQ_NAME RESTART WITH NUMERIC_VALUE;
Restarting the sequence with a value higher than the current max value of the id field it was being used for is what was required in my case.
Here NUMERIC_VALUE denotes a value higher than the current max value of my sequence-generated field.
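For example (schema, table, and sequence names are made up), the restart value can be derived from the table itself:
-- Find the highest key the table currently contains.
SELECT MAX(ID) FROM MYSCHEMA.MYTABLE;
-- If that returns, say, 41780, restart the sequence just above it:
ALTER SEQUENCE MYSCHEMA.ID_SEQ RESTART WITH 41781;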
Hope it will be helpful for others.
The probable cause of this issue was manual insertion of records into the database.

What is the maximum NUM_OF_PARAMS in Perl DBI placeholders

What is the maximum number of placeholders allowed in a single statement, i.e. the upper limit of the attribute NUM_OF_PARAMS?
I'm experiencing an odd issue while trying to tune the maximum number of rows per multi-row insert: setting the number to 20,000 gives me an error because $sth->{NUM_OF_PARAMS} becomes negative.
Reducing the max inserts to 5,000 works fine.
Thanks.
As far as I am aware, the only limitation in DBI itself is that the value is placed into a Perl scalar, so the limit is whatever a scalar can hold. For DBDs it is totally different; I doubt many databases, if any, support 20,000 parameters. BTW, NUM_OF_PARAMS is read-only, so I've no idea what you mean by "set the number to 20,000". I presume you mean you create a SQL statement with 20,000 parameters, then read NUM_OF_PARAMS, and it gives you a negative value. If so, I suggest you report that (with an example) on rt.cpan.org, as it does not sound right at all.
I cannot imagine a SQL statement with 20,000 parameters being very efficient in any database. Far better to try to reduce that to a range or something like it if you can. In ODBC, 20,000 parameters would mean 20,000 IPDs and APDs, and they are quite big structures. Since the DB2 CLI library is very much like ODBC, I would imagine you are going to eat up loads of memory.
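To make the arithmetic concrete (table and column names are made up): in a multi-row insert, every row contributes one placeholder per column, so the parameter count grows multiplicatively and hits such limits quickly.
-- 3 columns per row: 5,000 rows already need 15,000 placeholders,
-- and 20,000 rows would need 60,000.
INSERT INTO items (id, name, price) VALUES
    (?, ?, ?),
    (?, ?, ?),
    (?, ?, ?);  -- ...one parenthesized group per row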
Given that 20,000 parameters causes a negative value and 5,000 doesn't, there's a signed 16-bit integer somewhere in the system. Note that 20,000 on its own still fits in 16 signed bits (the maximum is 32,767), so the count is presumably being doubled somewhere along the way (2 x 20,000 = 40,000, which wraps to -25,536); that would put the practical upper bound at approximately 16,383.
However, the limit depends on the underlying DBMS and the API used by the DBD module for the DBMS (and possibly the DBD code itself); it is not affected by DBI.
Are you sure that's the best way to deal with your problem?