Last Updated Date: Antipattern? - triggers

I keep seeing questions floating through that make reference to a column in a database table named something like DateLastUpdated. I don't get it.
The only companion field I've ever seen is LastUpdateUserId or such. There's never an indicator about why the update took place; or even what the update was.
On top of that, this field is sometimes written from within a trigger, where even less context is available.
It certainly doesn't even come close to being an audit trail; so that can't be the justification. And if there is and audit trail somewhere in a log or whatever, this field would be redundant.
What am I missing? Why is this pattern so popular?

Such a field can be used to detect whether there are conflicting edits made by different processes. When you retrieve a record from the database, you get the previous DateLastUpdated field. After making changes to other fields, you submit the record back to the database layer. The database layer checks that the DateLastUpdated you submit matches the one still in the database. If it matches, then the update is performed (and DateLastUpdated is updated to the current time). However, if it does not match, then some other process has changed the record in the meantime and the current update can be aborted.

It depends on the exact circumstance, but a timestamp like that can be very useful for autogenerated data - you can figure out if something needs to be recalculated if a depedency has changed later on (this is how build systems calculate which files need to be recompiled).
Also, many websites will have data marking "Last changed" on a page, particularly news sites that may edit content. The exact reason isn't necessary (and there likely exist backups in case an audit trail is really necessary), but this data needs to be visible to the end user.

These sorts of things are typically used for business applications where user action is required to initiate the update. Typically, there will be some kind of business app (eg a CRM desktop application) and for most updates there tends to be only one way of making the update.
If you're looking at address data, that was done through the "Maintain Address" screen, etc.
Such database auditing is there to augment business-level auditing, not to replace it. Call centres will sometimes (or always in the case of financial services providers in Australia, as one example) record phone calls. That's part of the audit trail too but doesn't tend to be part of the IT solution as far as the desktop application (and related infrastructure) goes, although that is by no means a hard and fast rule.
Call centre staff will also typically have some sort of "Notes" or "Log" functionality where they can type freeform text as to why the customer called and what action was taken so the next operator can pick up where they left off when the customer rings back.
Triggers will often be used to record exactly what was changed (eg writing the old record to an audit table). The purpose of all this is that with all the information (the notes, recorded call, database audit trail and logs) the previous state of the data can be reconstructed as can the resulting action. This may be to find/resolve bugs in the system or simply as a conflict resolution process with the customer.

It is certainly popular - rails for example has a shorthand for it, as well as a creation timestamp (:timestamps).
At the application level it's very useful, as the same pattern is very common in views - look at the questions here for example (answered 56 secs ago, etc).
It can also be used retrospectively in reporting to generate stats (e.g. what is the growth curve of the number of records in the DB).

there are a couple of scenarios
Let's say you have an address table for your customers
you have your CRM app, the customer calls that his address has changed a month ago, with the LastUpdate column you can see that this row for this customer hasn't been touched in 4 months
usually you use triggers to populate a history table so that you can see all the other history, if you see that the creationdate and updated date are the same there is no point hitting the history table since you won't find anything
you calculate indexes (stock market), you can easily see that it was recalculated just by looking at this column
there are 2 DB servers, by comparing the date column you can find out if all the changes have been replicated or not etc etc ect

This is also very useful if you have to send feeds out to clients that are delta feeds, that is only the records that have been changed or inserted since the data of the last feed are sent.

Related

Data syncing with pouchdb-based systems client-side: is there a workaround to the 'deleted' flag?

I'm planning on using rxdb + hasura/postgresql in the backend. I'm reading this rxdb page for example, which off the bat requires sync-able entities to have a deleted flag.
Q1 (main question)
Is there ANY point at which I can finally hard-delete these entities? What conditions would have to be met - eg could I simply use "older than X months" and then force my app to only ever displays data for less than X months?
Is such a hard-delete, if possible, best carried out directly in the central db, since it will be the source of truth? Would there be any repercussions client-side that I'm not foreseeing/understanding?
I foresee the number of deleted's growing rapidly in my app and i don't want to have to store all this extra data forever.
Q2 (bonus / just curious)
What is the (algorithmic) basis for needing a 'deleted' flag? Is it that it's just faster to check a flag rather than to check for the omission of an object from, say, a very large list. I apologize if it's kind of a stupid question :(
Ultimately it comes down to a decision that's informed by your particular business/product with regards to how long you want to keep deleted entities in your system. For some applications it's important to always keep a history of deleted things or even individual revisions to records stored as a kind of ledger or history. You'll have to make a judgement call as to how long you want to keep your deleted entities.
I'd recommend that you also add a deleted_at column if you haven't already and then you could easily leverage something like Hasura's new Scheduled Triggers functionality to run a recurring job that fully deletes records older than whatever your threshold is.
You could also leverage Hasura's permissions system to ensure that rows that have been deleted aren't returned to the client. There is documentation and examples for ways to work with soft deletes and Hasura
For your second question it is definitely much faster to check for the deleted flag on records than to have to try and diff the entire dataset looking for things that are now missing.

Best practices for editing data using forms in ms-access

So I've started learning access due to necessity, as the person who was in charge of it passed way and someone had to keep it going.
I noticed a very bad (at least IMO) behavior in all databases he created: Every single form was bound directly to a table or saved query. This way, if the user opened a form, he had to complete all the steps he was supposed to do, because if he closed the form mid process (or the computer froze, or anything of the sort), the actual data would be compromised as it would be half complete. This often times broke everything in the process chain, rendering sub-sequential steps impossible to be performed and forced me to correct data manually directly in the tables.
As I've start upgrading his stuff and developing my own, I've been trying to learn ways to allow the data to be edited in the form only, making it possible to cancel the process anytime or save the changes all at once in the end.
If the editions were simple, I discovered that I could create a recordset, copy relevant data to unbound fields in the form and, in the end, if the user chose to, copy the data from the form fields back to the recordset.
Other times more complex solutions were required, as I would need to edit several pieces of data at once in continuous forms, "save" them, run more code, maybe add fields to hold the information originated from that processing and so on. I then learned about using temporary tables, but did not like it, since it tended to bloat the db. I even went on to creating temporary databases during code execution that would host the temporary tables and be destroyed in the end, but that added too much unnecessary complexity.
Nowadays I'm using disconnected ADO recordsets to hold the temporary data and fields. It works but has its limitations.
So I'm wondering, what is the best way you - much more experienced than me - guys use to approach this kind of scenario? Is using in memory ADO recordsets really the best way around?
I think you are mixing two things that a form does that have completely different requirements. Editing existing records (and bound forms are great for that) and creating new records (where using a straight bound form can result in creating incomplete records). The way to approach it depends on many things but mainly to how much data is necessary for a new record to be considered "complete".
I usually do one of the following things:
Create an unbound popup modal form for adding new records where only the necessary fields are present. Once complete it loads the new record to the main form for further editing.
Use the above method except the form is not a popup one but a set of unbound fields in the footer or header of the main form.
Let the user create new records but bind validation on the OnClose (and/or other appropriate to your situation) event of the form that deletes the half-filled record if it does not validate.
Let users create new records in the bound form but have a 'cleanup' routine called either on a schedule or based on user actions.
Ultimately if your business process requires the majority of fields to be manually added/edited every time a new record is added or edited, you are better off using an unbound form.
This way, if the user opened a form, he had to complete all the steps
he was supposed to do, because if he closed the form mid process (or
the computer froze, or anything of the sort), the actual data would be
compromised as it would be half complete
No, if the computer freezes, then no data is saved to the table. This is the same if you used a disconnected reocrdset and a un-bound form.
If you use the before update event in the form that has some verification code and does a simple cancel = true, then the forms data is not saved nor is the table updated. Again, if you used a dis-connected record set and the user closes the form, you have to test the data – and again you can either choose to write out the data or not – this effect is ZERO difference from using a bound form to a table or a disconnected form.
If the editions were simple, I discovered that I could create a
recordset, copy relevant data to unbound fields in the form and, in
the end, if the user chose to, copy the data from the form fields back
to the recordset.
No you don’t need to do the above. The above achieves nothing and only racks up additional development hours and increases cost of the application. In near all cases in-bound forms increase development costs over that of a simple form bound to a table. So the original developer had the correct idea. You can control the update of the underlying table in near all cases to achieve the required verification. Forms only save and write the data out if the developer allows as such.
So Access forms when bound no more or less write incomplete data out to a table if you place verification code in the forms before update event. A half-filled bound form, or a half filled un-bound form with dis-connected reocrdset BOTH will not write their data if the computer freezes.
And BOTH types of forms will not write out data to table until such time your verification code has completed.
Access is not designed for un-bound forms, and tools like vb.net, or even VB6 had a whole bunch of cool wizards and support for un-bound forms. In access, we don't have those wizards. And when you use UN-bound forms then you loose tons of form events. You in effect get the worst of both worlds, since you lose use of form events and have no wizards or support for un-bound. Even just the several delete record events we have are rather amazing.
You lose use of me.dirty, on-insert, me.newReocrd, forms after update events - the list of features you toss out and lose is huge. And if you want a button to write data to the table (such as a save button on the form), then just go:
If me.Dirty = True then
me.Dirty = False ' this forces your verification code to run
End if
There are FEW use cases in which in-bound forms will benefit you, but they will cost you rather much in terms of development times.

ms access all the data in my table does not show up in my form

I hope my question makes sense, I'll try to give as much info as possible.I should probably start off by saying this is the first access database (any database) I have ever done and my knowledge comes from trial and error as well as youtube and the occasional google search...NOOB
So I'm attempting to build a database using microsoft access (2007) for the first time (Student Records in my department). I have pulled in all the data I had available (names, major, graduate, advisor etc.) and made several appended tables for additional data using an append query (usually just pulling over name and ID# and major, and then adding the information that is related to the particular table).
Now I am going through the paper files (which we would like to get rid of) to update any missing data or add new students that we didn't have stored anywhere electronically.
I have created a form in which I can add new records or edit/add already available data that I need.
The problem that I have is that it pretty much pulls up everything I need except the occasional record (which I do a search in the search field on the bottom using the ID#) so I figure hey I must not have this student and add it, when I hit save it basically tells me this record can't be added as there already is a conflicted value. And when I check my table sure enough the record is there. In the form query where I check what tables the field's information is pulled from I have no criteria in there to filter any information out, the relationships overall are just based on the ID# (which is my primary key in all tables). When I check the data everything seems to be correct (not a wrong major, etc.) so I can't quite figure out why some records are not being pulled up.
My question is why and what can I do to fix it...
I hope my explanation is not to confusing. Thank you in advance.

When a teacher add an assignment, all the student names appear. How to do it?

I have a task to create a database to track student results in a school. I came out with a set of relationships between the tables according to the 3 forms of normalisation(I hope I got it right. If not, please enlighten me).
One feature that I want to put in the Filemaker app is that when a teacher want to enter some assignment marks, he will just need to create a new submission record and all the student names in the class will appear.
I could not think how this feature can be done in Filemaker. I can only create a new submissions record and key in a student's score, then create another new record to do the same thing for a second student.
Can someone help? I am a teacher, not a Filemaker developer so please correct me if my database tables are done wrongly.
Update:
I will like the output to be like this
Spreadsheet is not suitable because it can't be used to search/sort easily.
I have a quick sample file here. It's an old sample and it uses a different (but similar) model. Basically the idea is that: You have a calculated field (I use a repeating field) to display the data. You also have a global repeating field that serves as an editing widget. Each time you go to a record you fill this field's reps with data from related records (using a OnRecordLoad trigger). This doesn't mean the field shows the same data for all records, because its conditional formatting rules are set to hide all data; so it only shows a piece of data when you actually enter one of its repetitions. This is the data that can be edited. And finally there's a trigger that fires each time you exit the field and posts your changes to the related table (adds, updates, or deletes).
The sample isn't quite complete because if there's fewer data columns than repetitions, you'd probably want to somehow lock the remaining repetitions; this part isn't done. Otherwise it works fairly well. In FM 12, however, it tends to freeze the app; I reported this to FMI, they acknowledged it, but I don't think it has been fixed already.

How do you manage concurrent access to forms?

We've got a set of forms in our web application that is managed by multiple staff members. The forms are common for all staff members. Right now, we've implemented a locking mechanism. But the issue is that there's no reliable way of knowing when a user has logged out of the system, so the form needs to be unlocked. I was wondering if there was a better way to manage concurrent users editing the same data.
You can use optimistic concurrency which is how the .Net data libraries are designed. Effectively you assume that usually no one will edit a row concurrently. When it occurs, you can either throw away the changes made, or try and create some nicer retry logic when you have two users edit the same row.
If you keep a copy of what was in the row when you started editing it and then write your update as:
Update Table set column = changedvalue
where column1 = column1prev
AND column2 = column2prev...
If this updates zero rows, then you know that the row changed during the edit and you can then deal with it, or simply throw an error and tell the user to try again.
You could also create some retry logic? Re-read the row from the database and check whether the change made by your user and the change made in the database are able to be safely combined, then do so automatically. Or you could present a choice to the user as to whether they still wish to make their change based on the values now in the database.
Do something similar to what is done in many version control systems. Allow anyone to edit the data. When the user submits the form, the database is checked for changes. If the record has not been changed prior to this submission, allow it as usual. If both changes are the same, ignore the incoming (now redundant) change.
If the second change is different from the first, the record is now in conflict. The user is presented with a new form, which indicates which fields were changed by the conflicting update. It is then the user's responsibility to resolve the conflict (by updating both sets of changes), or to allow the existing update to stand.
As Spence suggested, what you need is optimistic concurrency. A standard website that does no accounting for whether the data has changed uses what I call "last write wins". Simply put, whichever connection saves to the database last, that version of the data is the one that sticks. In optimistic concurrency, you use a "first write wins" logic such that if two connections try to save the same row at the same time, the first one that commits wins and the second is rejected.
There are two pieces to this mechanism:
The rules by which you fail the second commit
How the system or the user handles the rejected commit.
Determining whether to reject the commit
Two approaches:
Comparison column that changes each time a commit happens
Compare the data with its committed version in the database.
The first one entails using something like SQL Server's rowversion data type which is guaranteed to change each time the row changes. The upside is that it makes it simple to roll your own logic to determine if something has changed. When you get the data, you pull the rowversion column's value and when you commit, you compare that value with what is currently in the database. If they are different, the data has changed since you last retrieved it and you should reject the commit otherwise proceed to save the data.
The second one entails comparing the columns you pulled with their existing committed values in the database. As Spence suggested, if you attempt the update and no rows were updated, then clearly one of the criteria failed. This logic can get tricky when some of the values are null. Many object relational mappers and even .NET's DataTable and DataAdapter technology can help you handle this.
Handling the rejected commit
If you do not leave it up to the user, then the form would throw some message stating that the data has changed since they last edited and you would simply re-retrieve the data overwriting their changes. As you can imagine, users aren't particularly fond of this solution especially in a high volume system where it might happen frequently.
A more sophisticated (and also more complicated) approach is to show the user what has changed allow them to choose which items to try to re-commit, Behind the scenes you would retrieve the data again, overwrite the values picked by the user with their entries and try to commit again. In high volume system, this will still be problematic because by the time the user has tried to re-commit, the data may have changed yet again.
The checkout concept is effectively pessimistic concurrency where users "lock" rows. As you have discovered, it is difficult to implement in a stateless environment. Users are notorious for simply closing their browser while they have something checked out or using the Back button to return a set that was checked out and try to recommit it. IMO, it is more trouble than it is worth to try go this route in a web-based solution. Assuming you write the user name that last changed a given row, with optimistic concurrency, you can inform the user whose changes are rejected who saved the data before them.
I have seen this done two ways. The first is to have a "checked out" column in your database table associated with that data. Your service would have to look for this flag to see if it is being edited. You can have this expire after a time threshold is met (with a trigger) if the user doesn't commit changes. The second way is having a dedicated "checked out" table that stores id's and object names (probably the table name). It would work the same way and you would have less lookup time, theoretically. I see concurrency issues using the second method, however.
Why do you need to look for session timeout? Just synchronize access to your data (forms or whatever) and that's it.
UPDATE: If you mean you have "long transactions" where form is locked as soon as user opens editor (or whatever) and remains locked until user commits changes, then:
either use optimistic locking, implement it by versioning of forms data table
optimistic locking can cause loss of work, if user have been away for a long time, then tried to commit his changes and discovered that someone else already updated a form. In this case you may want to implement explicit "locking" of form, where user "locks" form as soon as he starts work on it. Other user will notice that form is "locked" and either communicate with lock owner to resolve issue, or he can "relock" form for himself, loosing all updates of first user in process.
We put in a very simple optimistic locking scheme that works like this:
every table has a last_update_date
field in it
when the form is created
the last_update_date for the record
is stored in a hidden input field
when the form is POSTED the server
checks the last_update_date in the
database against the date in the
hidden input field.
If they match,
then no one else has changed the
record since the form was created so
the system updates the data.
If they don't match, then someone else has
changed the record since the form was
created. The system sends the user back to the form edit page and tells the user that someone else edited the record and they must reapply their changes.
It is very simple and works well enough.
You can use "timestamp" column on your table. Refer: What is the mysterious 'timestamp' datatype in Sybase?
I understand that you want to avoid overwriting existing data with consecutively updates.
If so, when the user opens a screen you have to get last "timestamp" column to the client.
After changing data just before update, you should check the "timestamp" columns(yours and db) to make sure if anyone has changed tha data while he is editing.
If its changed you will alert an error and he has to startover. If it is not, update the data. Timestamp columns updated automatically.
The simplest method is to format your update statement to include the datetime when the record was last updated. For example:
UPDATE my_table SET my_column = new_val WHERE last_updated = <datetime when record was pulled from the db>
This way the update only succeeds if no one else has changed the record since the last read.
You can message to the user on conflict by checking if the update suceeded via a SELECT after the UPDATE.