I've read the Core Data references on Apple's site, but I wonder what they mean by atomic store types. They write something like "they have to be read and written in their entirety".
Could somebody clarify this, please?
Atomicity refers to the property of an entire operation happening before another operation can interrupt it; i.e. once started, an "atomic" operation will complete from the perspective of that thread. In practice this often means checking the value of something at the start and end of an operation, and if it changed in a way the atomic operation did not itself cause, re-attempting the operation.
It means that they cannot be in a state where they are half-written.
We would like to be able to read state inside a command use case.
We could get the state from the event store for a specific aggregate, but what about querying aggregates by field (not by id), or performing more complicated queries that the event store is not suited for?
The approach we were considering was to use our read model for those cases as well, not only for query use cases.
This might be inconsistent, so a solution could be to store the latest version of the aggregate in both the write and read models, in order to be able to tell whether the state is current or stale.
Does this make sense? And if so, when we need to get state by id, should we use the event store or the read model?
If you want the absolute latest state of an event-sourced aggregate, you're going to have to read the latest snapshot (assuming that you are snapshotting) and then replay events since that snapshot from the event store. You can be aggressive about snapshotting (conceivably even saving a snapshot after every command), but you're giving away some write performance to make the read faster.
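To illustrate, here is a minimal Java sketch of that "latest snapshot plus newer events" rehydration; SnapshotStore, EventStore, Snapshot, Event and OrderAggregate are hypothetical placeholder types invented for the example, not any particular framework's API.

import java.util.List;
import java.util.UUID;

// Hypothetical types, just for the sketch.
interface SnapshotStore { Snapshot latestFor(UUID aggregateId); }      // may return null
interface EventStore { List<Event> readFrom(UUID aggregateId, long fromVersion); }
interface Event { }
record Snapshot(UUID aggregateId, long version, byte[] state) { }

class OrderAggregate {
    static OrderAggregate blank(UUID id) { return new OrderAggregate(); }
    static OrderAggregate fromSnapshot(Snapshot s) { return new OrderAggregate(); }
    void apply(Event e) { /* mutate state according to the event */ }
}

class AggregateLoader {
    // Latest state = newest snapshot (if any) + replay of every event recorded after it.
    static OrderAggregate loadLatest(UUID id, SnapshotStore snapshots, EventStore events) {
        Snapshot snapshot = snapshots.latestFor(id);
        OrderAggregate aggregate = (snapshot != null)
                ? OrderAggregate.fromSnapshot(snapshot)
                : OrderAggregate.blank(id);
        long fromVersion = (snapshot != null) ? snapshot.version() + 1 : 0;
        for (Event event : events.readFrom(id, fromVersion)) {
            aggregate.apply(event);   // fold each newer event into the state
        }
        return aggregate;
    }
}

The more aggressively you snapshot, the shorter that replay loop gets, which is exactly the write-performance-for-read-performance trade mentioned above.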
Updating the read model directly is conceivably possible, though that level of coupling is something that should be considered very carefully. Note also that you will very likely need some sort of two-phase commit to ensure that the read model is only updated when the write model is updated and vice versa. I strongly suggest considering why you're using CQRS/ES in this project, because you are quite possibly undermining that reason by doing this sort of thing.
In general, if you need a query for processing a particular command, it's likely that query will generally be the same, i.e. you don't need free-form query support. In that case, you can often have a read model that's tuned for exactly that query and which only cares about events which could affect that query: often a fairly small subset of the events. The finer-grained the read model, the easier it is to keep in sync (if it ignores 99% of events, for instance, it can't really fall that far behind).
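As a rough sketch of such a narrowly scoped read model (all names invented for the example): a projection that answers exactly one question and ignores every event that cannot affect the answer.

import java.util.HashMap;
import java.util.Map;

// Hypothetical events; only these two can change the answer.
record ReservationOpened(String customerId) { }
record ReservationClosed(String customerId) { }

// Answers one question only: "how many open reservations does this customer have?"
class OpenReservationCountProjection {
    private final Map<String, Integer> countsByCustomer = new HashMap<>();

    void on(Object event) {
        if (event instanceof ReservationOpened e) {
            countsByCustomer.merge(e.customerId(), 1, Integer::sum);
        } else if (event instanceof ReservationClosed e) {
            countsByCustomer.merge(e.customerId(), -1, Integer::sum);
        }
        // Every other event type is irrelevant to this query and is simply ignored.
    }

    int openCountFor(String customerId) {
        return countsByCustomer.getOrDefault(customerId, 0);
    }
}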
Needing to make complex queries as part of command processing could also be a sign that your aggregate boundaries aren't right and could do with a re-examination.
Does this make sense
Maybe. Let's start with
This might be inconsistent
Yup, they might be. So what?
We typically respond to a query by sending an unlocked copy of the answer. In other words, it's possible that the actual information in the write model will change after this response is dispatched but before the response arrives at its destination. The client will be looking at a copy of the answer taken from the past.
So we might reasonably ask how much better it is to get information no more than one minute old compared to information no more than five minutes old. If the difference in value is pennies, then you should probably deploy the five minute version. If the difference is millions of dollars, then you're in a good position to negotiate a real budget to solve the problem.
For processing a command in our own write model, that kind of inconsistency isn't usually acceptable or wise. But neither of the two common answers requires keeping the read and write models synchronized. The most common answer is to just work with the write model alone. The less common answer is to grab a snapshot out of a cache, and then apply any additional events to it to bring it up to date. The latter approach is "just" a performance optimization (first rule: don't.)
The variation that trips everyone up is trying to process a command somewhere else, enforcing a consistency rule on our data here. Once again, you need a really clear picture of how valuable the consistency is to the business. If it's really important, that may be a signal that the information in question shouldn't be split into two different piles - you may be working with the wrong underlying data model.
Possibly useful references
Pat Helland Data on the Outside Versus Data on the Inside
Udi Dahan Race Conditions Don't Exist
I had some questions about the IReliableCollection.ClearAsync method, and I could not find answers to them in the documentation.
Can we assume it is an atomic operation?
What happens if the node hosting the partition with the reliable collection crashes during, or right after the method is called? Is it possible for the collection to contain a subset of the older items due to such crashes?
Appreciate any help!
The idea of reliable collections is that all operations, absent exceptions, are persistent and replicated. The confusion probably arises from the fact that ClearAsync doesn't take an ITransaction parameter (so it can't be rolled back), but nothing in the documentation indicates it offers anything less than the strong consistency guarantees of every other operation.
In short, the answer to your first question is yes, you can consider it atomic; the second question follows.
Source.
Can someone explain to me what's the difference between atomic operations and atomic transactions? It seems to me that these two are the same thing. Is that correct?
The concept of Atomicity is common between atomic transactions and atomic operations, but they are usually related to different domains.
Atomic Transactions are associated with Database operations where a set of actions must ALL complete or else NONE of them complete. For example, if someone is booking a flight, you want to both get payment AND reserve the seat OR do neither. If either one were allowed to succeed without the other also succeeding, the database would be inconsistent.
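A rough JDBC sketch of that all-or-nothing behaviour (table and column names are made up for the example):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class FlightBooking {
    // Charge the payment and reserve the seat in one transaction:
    // either both changes are committed, or both are rolled back.
    void bookFlight(Connection conn, long bookingId, long seatId) throws SQLException {
        conn.setAutoCommit(false);   // start an explicit transaction
        try (PreparedStatement pay = conn.prepareStatement(
                 "INSERT INTO payments (booking_id, status) VALUES (?, 'CAPTURED')");
             PreparedStatement seat = conn.prepareStatement(
                 "UPDATE seats SET booking_id = ? WHERE id = ? AND booking_id IS NULL")) {
            pay.setLong(1, bookingId);
            pay.executeUpdate();

            seat.setLong(1, bookingId);
            seat.setLong(2, seatId);
            if (seat.executeUpdate() == 0) {
                throw new SQLException("seat no longer available");
            }
            conn.commit();           // both changes become visible together
        } catch (SQLException e) {
            conn.rollback();         // neither change is applied
            throw e;
        }
    }
}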
Atomic Operations on the other hand are usually associated with low-level programming with regards to multi-processing or multi-threading applications and are similar to Critical Sections.
For example, if two threads both access and modify the same variable, each thread goes through the following steps:
Read the variable from storage into local memory.
Modify the value in local memory.
Write the modified value back to the original storage location.
But in a multi-threaded system an interrupt or other context switch might happen after the first process has read the value but before it has written it back. The second process (or interrupt) will then read and modify the OLD value and write its modified value back to storage. When the first process resumes, it doesn't know that anything has changed, so it writes back its own value computed from the stale copy. Hence the change the second process made to the variable is lost.
If an operation is atomic, it is guaranteed to complete without being interrupted once it begins. This is usually accomplished using hardware-level primitives like Test-and-Set or Compare-and-Swap.
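A small Java illustration of the difference: the plain ++ below is a read-modify-write and can lose updates through exactly the interleaving described above, while AtomicInteger performs the same update with a compare-and-swap style primitive.

import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    static int plainCounter = 0;                               // shared, unsynchronized
    static AtomicInteger atomicCounter = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                plainCounter++;                    // NOT atomic: read, add, write back
                atomicCounter.incrementAndGet();   // atomic: cannot lose an update
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();

        // plainCounter will very likely be less than 200000; atomicCounter is always 200000.
        System.out.println("plain:  " + plainCounter);
        System.out.println("atomic: " + atomicCounter.get());
    }
}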
To get a wider picture, you can take a look at:
MySQL Transactions and Atomic Operations
Atomicity (database systems)
Atomicity (Programming)
Some quotes from the above-cited resources:
About databases:
In an atomic transaction, a series of database operations either all occur, or nothing occurs. A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. In other words, atomicity means indivisibility and irreducibility.
About programming:
In concurrent programming, an operation (or set of operations) is atomic, linearizable, indivisible or uninterruptible if it appears to the rest of the system to occur instantaneously. Atomicity is a guarantee of isolation from concurrent processes. Additionally, atomic operations commonly have a succeed-or-fail definition: they either successfully change the state of the system, or have no apparent effect.
I have seen the word transaction used more often for databases, and operation used more often in programming, especially kernel-level programming.
In a statement:
An atomic transaction is the smallest set of operations needed to perform the required steps.
Either all of those operations happen (successfully) or the atomic transaction fails.
An atomic operation, on the other hand, usually has nothing to do with transactions. To my knowledge the term comes from hardware programming, where a set of operations (or a single one) appears to execute instantaneously.
When dealing with MongoDB, when should I use {safe: true} on queries?
Right now I use the 'safe' option just to check whether my inserts or updates succeeded. However, I feel this might be overkill.
Should I assume that 99% of the time my queries (assuming they are properly written) will be inserted/updated, and not worry about checking whether they actually succeeded?
Thoughts?
Assuming when you say queries you actually mean writes/inserts (the wording of your question makes me think this) then the Write Concern (safe, none, fsync, etc) can be used to get more speed and less safety when that is acceptable, and less speed and more safety when that is necessary.
As an example, a hypothetical Facebook-style application could use an unsafe write for "Likes" while it would use a very safe write for password changes. The logic behind this is that there will be many thousand "Like"-style updates happening a second, and it doesn't matter if one is lost, whereas password updates happen less regularly but it is essential that they succeed.
Therefore, try to tailor your Write Concern choice to the kind of update you are doing, based upon your speed and data integrity requirements.
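For example, with the legacy MongoDB Java driver (the same API used in the snippet further down) you can set a different write concern depending on the operation; the collection and field names here are invented:

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.WriteConcern;

class WriteConcernExamples {
    // Losing the odd "Like" is acceptable, so don't wait for the server to acknowledge it.
    void recordLike(DBCollection likes, String postId) {
        likes.setWriteConcern(WriteConcern.NORMAL);
        likes.insert(new BasicDBObject("postId", postId));
    }

    // A password change must not be lost: wait until it has reached multiple replicas.
    void changePassword(DBCollection users, Object userId, String passwordHash) {
        users.setWriteConcern(WriteConcern.REPLICAS_SAFE);
        users.update(new BasicDBObject("_id", userId),
                     new BasicDBObject("$set", new BasicDBObject("passwordHash", passwordHash)));
    }
}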
Here is another use case where unsafe writes are an appropriate choice: You are making a large number of writes in very short order. In this case you might perform a number of writes, and then call get last error to see if any of them failed.
collection.setWriteConcern(WriteConcern.NORMAL);   // unacknowledged writes: fast, no per-insert round trip
collection.getDB().resetError();                   // clear any previous error state
for (Something data : importData) {
    collection.insert(makeDBObject(data));
}
// One acknowledged check at the end: throws if any of the preceding writes failed.
collection.getDB().getLastError(WriteConcern.REPLICAS_SAFE).throwOnError();
If this block succeeds without an exception, then all of the data was inserted successfully. If there was an exception, then one or more of the write operations failed, and you will need to retry them (or check for a unique index violation, etc). In real life, you might call getLastError every 10 writes or so, to avoid having to resubmit lots of requests.
This pattern is very nice for performance when performing bulk inserts of large amounts of data.
Safe is only necessary on writes, not reads. Queries are only reads.
I've read this article on JPA concurrency, but either I am too thick or it is not explicit enough.
I am looking to do a database-controlled atomic update-if-found-else-insert operation (an UPSERT).
It looks to my poor slow brain like I can, within a transaction of course, run a named query with a lock mode of PESSIMISTIC_WRITE, see if it returns any results, and then call either persist() or update() afterwards.
What I am not clear on is the difference between doing this operation with a PESSIMISTIC_WRITE lock versus a PESSIMISTIC_READ lock. I've read the sentences: I understand that PESSIMISTIC_READ is intended to prevent non-repeatable reads, and PESSIMISTIC_WRITE is... well, maybe I don't understand that one so well :-) ... but underneath it's just a SQL SELECT FOR UPDATE, yeah? In both cases?
I am looking to do a database-controlled atomic update-if-found-else-insert operation (an UPSERT).
I'm maybe not answering exactly the whole question, but if you want to implement the above without any race condition, you need, IMO, a table-level lock in exclusive mode (not just row locks). I don't know whether this can be done with JPA. Maybe you could clarify what would be acceptable for you.
I have faced this kind of situation and found this:
Pessimistic locking, which means locking objects at the start of the transaction and holding the lock for its duration, is done with these two pessimistic lock modes:
- LockModeType.PESSIMISTIC_READ: the entity can be read by other transactions, but no changes can be made to it
- LockModeType.PESSIMISTIC_WRITE: the entity can be neither read nor written by other transactions
link to the article
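For what it's worth, here is roughly how those lock modes are requested through JPA; the Account entity and its fields are invented for the example:

import javax.persistence.EntityManager;
import javax.persistence.LockModeType;

class AccountUpsert {
    // Must be called inside an active transaction.
    void upsertAccount(EntityManager em, Long id, String name) {
        // PESSIMISTIC_WRITE typically translates to SELECT ... FOR UPDATE:
        // other transactions cannot update or lock this row until we commit.
        Account existing = em.find(Account.class, id, LockModeType.PESSIMISTIC_WRITE);
        if (existing != null) {
            existing.setName(name);              // found: update under the lock
        } else {
            em.persist(new Account(id, name));   // not found: insert
        }
        // With PESSIMISTIC_READ the provider may take a shared lock instead,
        // which allows other readers but blocks concurrent writers.
    }
}

Note that the not-found/insert branch is still exposed to the race the other answer mentions: pessimistic locking can only lock rows that already exist.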
I am looking to do a database-controlled atomic
update-if-found-else-insert operation (an UPSERT).
INSERT .. ON DUPLICATE KEY UPDATE does that.
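Fired from JPA as a native query against MySQL it could look roughly like this (table and column names invented):

import javax.persistence.EntityManager;

class CounterUpsert {
    // The database performs the insert-or-update atomically,
    // so there is no read-then-write race in application code.
    void upsertCounter(EntityManager em, String key) {
        em.createNativeQuery(
                "INSERT INTO counters (counter_key, hits) VALUES (?1, 1) " +
                "ON DUPLICATE KEY UPDATE hits = hits + 1")
          .setParameter(1, key)
          .executeUpdate();
    }
}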