I have an Atom document where the atom:updated element date is that of the request made to the server. Therefore it changes on each request.
My initial thought was that this is incorrect and that I would prefer to use the date of the latest published item.
So I checked the Atom documentation for atom:updated:
The "atom:updated" element is a Date construct indicating the most recent instant in time when an entry or feed was modified in a way the publisher considers significant. Therefore, not all modifications necessarily result in a changed atom:updated value.
atomUpdated = element atom:updated { atomDateConstruct }.
Publishers MAY change the value of this element over time.
I believe this backs up my thoughts, but maybe there is a case for using the request date in certain circumstances too.
So what date is used in atom:updated?
The specification you cite seems to answer that question:
The "atom:updated" element is a Date construct indicating the most recent instant in time when an entry or feed was modified in a way the publisher considers significant.
As a publisher, if you consider the document to be under constant significant modification then it would be compliant with the specification to return the current time.
However, it's more likely that consumers would expect changes to that date to reflect specific changes to the feed's content.
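If, as a publisher, you go the other way and derive the feed-level value from its entries, a minimal sketch (in Java, with made-up entry timestamps) is simply the maximum atom:updated across entries:

    import java.time.OffsetDateTime;
    import java.util.List;

    public class FeedUpdated {
        // Derive the feed-level atom:updated from the most recently changed entry,
        // rather than from the time of the request.
        static OffsetDateTime feedUpdated(List<OffsetDateTime> entryUpdatedTimes) {
            return entryUpdatedTimes.stream()
                    .max(OffsetDateTime::compareTo)
                    .orElse(OffsetDateTime.now()); // empty feed: falling back to "now" is a policy choice
        }

        public static void main(String[] args) {
            List<OffsetDateTime> entries = List.of(
                    OffsetDateTime.parse("2012-03-01T10:15:30Z"),
                    OffsetDateTime.parse("2012-04-02T08:00:00Z"));
            System.out.println(feedUpdated(entries)); // prints 2012-04-02T08:00Z
        }
    }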
I've stumbled on a page in the PostgreSQL wiki which advises against using BETWEEN with timestamps:
Why not?
BETWEEN uses a closed-interval comparison: the values of both ends of the specified range are included in the result.
This is a particular problem with queries of the form

    SELECT * FROM blah WHERE timestampcol BETWEEN '2018-06-01' AND '2018-06-08'

This will include results where the timestamp is exactly 2018-06-08 00:00:00.000000, but not timestamps later in that same day. So the query might seem to work, but as soon as you get an entry exactly on midnight, you'll end up double-counting it.
Can anyone explain how this "double-counting" can occur?
Often when using BETWEEN, you want to use multiple ranges (in separate queries) which cumulatively cover all the data.
If the next invocation of your query uses BETWEEN '2018-06-08' AND '2018-06-15', then exact midnight of 2018-06-08 will be included in both invocations. On the other hand, if the next one uses BETWEEN '2018-06-09' AND '2018-06-15', then all of 2018-06-08 except for exact midnight has been overlooked. There is no easy way (using BETWEEN) to construct ranges which cover every data point once and only once.
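The usual fix is to replace BETWEEN with a half-open interval (include the lower bound, exclude the upper one) so that consecutive ranges tile the data with no gaps and no overlap. A rough JDBC sketch, reusing the blah/timestampcol names from the quoted query; the count(*) and the method itself are just for illustration:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Timestamp;
    import java.time.LocalDate;

    public class HalfOpenRange {
        // Counts rows with timestampcol in [start, end): the upper bound is excluded,
        // so [2018-06-01, 2018-06-08) and [2018-06-08, 2018-06-15) tile the data and
        // a row stamped exactly 2018-06-08 00:00:00 is counted exactly once.
        static long countInRange(Connection conn, LocalDate start, LocalDate end) throws Exception {
            String sql = "SELECT count(*) FROM blah WHERE timestampcol >= ? AND timestampcol < ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setTimestamp(1, Timestamp.valueOf(start.atStartOfDay()));
                ps.setTimestamp(2, Timestamp.valueOf(end.atStartOfDay()));
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    return rs.getLong(1);
                }
            }
        }
    }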
In my KafkaStreams app I have a registered local store (simple counters) that is updated in the transform method.
In the punctuate method I basically loop over the KV-store and push all the data to the output topic, even if a value hasn't been updated since the last punctuate call.
One idea is to store the update timestamp for every key and forward only records updated since the last punctuate call.
But I think there should be a more convenient solution for that.
How can I make this more performant and forward only the updated entries?
As indicated in the comments from Matthias, keeping track of updated records is not supported at the moment.
Your approach of updating a timestamp in the value (or creating a "Value Wrapper" object that contains a timestamp you can modify) and checking if an update has occurred since the last punctuate call is a valid approach.
-Bill
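A minimal sketch of that timestamp-in-the-value approach using the Processor API; the store name, the TimestampedCount wrapper, the key/value types, and the 30-second punctuation interval are placeholders, and the serde wiring for the wrapper is omitted:

    import java.time.Duration;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.kstream.Transformer;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.processor.PunctuationType;
    import org.apache.kafka.streams.state.KeyValueIterator;
    import org.apache.kafka.streams.state.KeyValueStore;

    // Value wrapper: the counter plus the time its key was last touched.
    class TimestampedCount {
        long count;
        long lastUpdatedMs;
    }

    public class DirtyOnlyTransformer implements Transformer<String, Long, KeyValue<String, Long>> {
        private ProcessorContext context;
        private KeyValueStore<String, TimestampedCount> store;
        private long lastPunctuateMs = 0L;

        @Override
        @SuppressWarnings("unchecked")
        public void init(ProcessorContext context) {
            this.context = context;
            // "counts-store" must match the name the store was registered under
            // (it needs a serde for TimestampedCount, not shown here).
            this.store = (KeyValueStore<String, TimestampedCount>) context.getStateStore("counts-store");
            context.schedule(Duration.ofSeconds(30), PunctuationType.WALL_CLOCK_TIME, this::punctuate);
        }

        @Override
        public KeyValue<String, Long> transform(String key, Long value) {
            TimestampedCount current = store.get(key);
            if (current == null) {
                current = new TimestampedCount();
            }
            current.count += value;
            current.lastUpdatedMs = context.timestamp(); // remember when this key changed
            store.put(key, current);
            return null; // output is emitted from the punctuator
        }

        private void punctuate(long nowMs) {
            try (KeyValueIterator<String, TimestampedCount> it = store.all()) {
                while (it.hasNext()) {
                    KeyValue<String, TimestampedCount> entry = it.next();
                    // Forward only the keys that changed since the previous punctuation.
                    if (entry.value.lastUpdatedMs >= lastPunctuateMs) {
                        context.forward(entry.key, entry.value.count);
                    }
                }
            }
            lastPunctuateMs = nowMs;
        }

        @Override
        public void close() { }
    }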
I'm starting to play around with the Smartsheet API. I've been asked to retrieve the version history of the different rows of a sheet. I can see that the API returns a version number with the row, but I can't find any documentation on how to retrieve a particular version of a row. Is it possible?
Since individual cells might be edited more or fewer times than another cell in the same row, it's not really feasible to specify a specific revision number. You would need to pull the complete history for each cell, and then do your own filtering logic to match whatever criteria you're looking for.
http://smartsheet-platform.github.io/api-docs/#get-cell-history
According to the API documentation, there is no filter criteria parameter you can specify, so this logic must be implemented yourself. For instance, you might filter by the last revision that happened before a certain modifiedAt value.
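As a rough sketch of that client-side approach (the IDs and token are placeholders, and the URL follows the cell-history endpoint described in the documentation linked above):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class CellHistoryFetch {
        public static void main(String[] args) throws Exception {
            // Placeholder IDs and token -- substitute your own values.
            long sheetId = 123L, rowId = 456L, columnId = 789L;
            String token = "YOUR_ACCESS_TOKEN";

            // Cell-history endpoint as described in the documentation linked above.
            String url = String.format(
                "https://api.smartsheet.com/2.0/sheets/%d/rows/%d/columns/%d/history",
                sheetId, rowId, columnId);

            HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();

            HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());

            // The response is a paged list of cell revisions; any filtering (for example,
            // keeping only revisions whose modifiedAt is before some cutoff) has to be
            // done client-side after parsing this JSON, since there is no filter parameter.
            System.out.println(response.body());
        }
    }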
I was looking at the diagram for a sync operation in the Kinto docs and I have a question.
Why is .sync() a pull.then(push).then(pull) instead of just pull.then(push)?
What do we need the last pull for?
When you do your push, you will update the records' last_modified value, so at the end you will need to grab the new last_modified value of the list.
You may also have received some changes to the collection (made by another device) while you were pushing your changes.
Pulling after the push lets you grab the new last_modified value, the changes that you've made, as well as the changes that were made in the meantime.
At this point you might think that grabbing the changes you've made is a bit silly (because you already know what you just pushed). It is basically the subject of this issue.
The idea is that you can also try to pull using the last_modified value of your last update as an If-Match header, the last_modified value of the collection before your changes as a _since parameter, and exclude all the record IDs you've just changed.
In that case you will get a 304 most of the time or a list of changes that were made by others while you were doing your push.
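To make the order of operations concrete, here is a rough sketch; SyncClient, pull, and push are hypothetical helpers standing in for the real Kinto calls, not its actual API:

    import java.util.List;
    import java.util.Set;

    // Hypothetical helpers used only to illustrate the pull -> push -> pull order;
    // they do not mirror the real kinto.js API.
    interface SyncClient {
        List<SyncRecord> pull(long since);                         // remote changes newer than `since`
        PushResult push(List<SyncRecord> localChanges);            // upload local changes
        List<SyncRecord> pull(long since, Set<String> excludeIds); // pull, skipping the records we just pushed
    }

    record SyncRecord(String id, long lastModified) { }
    record PushResult(long newLastModified, Set<String> pushedIds) { }

    class SyncFlow {
        long lastModified; // collection timestamp from the previous sync

        void sync(SyncClient client, List<SyncRecord> localChanges) {
            // 1. Pull: fetch whatever changed remotely since the last sync.
            List<SyncRecord> remote = client.pull(lastModified);

            // 2. Push: upload local changes; the server stamps them with a new last_modified.
            PushResult result = client.push(localChanges);

            // 3. Pull again: grab the new collection last_modified plus anything another
            //    device wrote while we were pushing. Passing the pre-push timestamp as
            //    _since and excluding the IDs we just pushed means this call usually
            //    comes back empty (or as an HTTP 304).
            List<SyncRecord> concurrent = client.pull(lastModified, result.pushedIds());

            lastModified = result.newLastModified();
            // Merging `remote` and `concurrent` into the local store is out of scope here.
        }
    }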
I'm trying to implement a sync using CalDAV and sync reports, however I'm having conceptual problems with how to sync one calendar object (one VEVENT) between multiple clients and the server.
Most RFCs refer to the use of the ETag to determine if a resource has changed since it was last synced (if the ETag changes, the resource has changed since the last sync). That I get. However, how do I know which change is more recent?
For example, client A has an iCal 'X' that was last edited at 1AM, and they sync at 8AM. Client B also has a version of iCal X, which they edited at 2AM and sync at 7AM. So B is newer than A, and B synced before A.
When A syncs, it will see B's newer version of X. From the ETag it knows that X has changed, but not 'when'. I'm assuming that A should overwrite its copy with B's, since B is newer (or at least prompt the user saying B is newer)... is this assumption correct / is there a standard way to handle this situation?
The problem in general is figuring out which file is newer between the server and a client. The ETag can only detect 'changed', not 'newer'. The last-modified date seems to reflect the iCal's upload date and not its last edit date on the client. This leads me to believe I'm missing something. Is there some generally accepted algorithm for syncing?
The last edit date is just one piece of the equation here. More meaningful is the actual modification. You might have turned off an alarm from device B (an insignificant change) but changed the start date from device A (a major change). So, a well-behaved client should make its best effort at merging the two.
Some clients will just notify you that the event has been edited and ask which copy to keep, but without a side-by-side comparison UI this is really confusing for end users.
Without a merge mechanism, I would just ignore the etag and always overwrite.
Finally, you should also worry about the schedule-tag of the event (see https://www.rfc-editor.org/rfc/rfc6638#section-3.2.10 ).
Also, the iCal file should contain a SEQUENCE number (incremented on each edit), which is more important than the edit date. By comparing SEQUENCE values you can at least decide which edit is newer, as long as the two values are not equal.
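A small sketch of that SEQUENCE comparison; the parsing is deliberately naive and the sample VEVENTs are made up, and a missing SEQUENCE defaults to 0 per RFC 5545:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class SequenceCompare {
        private static final Pattern SEQUENCE = Pattern.compile("(?m)^SEQUENCE:(\\d+)");

        // Extract the SEQUENCE of a VEVENT; RFC 5545 defaults it to 0 when absent.
        static int sequenceOf(String ics) {
            Matcher m = SEQUENCE.matcher(ics);
            return m.find() ? Integer.parseInt(m.group(1)) : 0;
        }

        public static void main(String[] args) {
            String copyA = "BEGIN:VEVENT\nUID:X\nSEQUENCE:2\nDTSTART:20180601T010000Z\nEND:VEVENT";
            String copyB = "BEGIN:VEVENT\nUID:X\nSEQUENCE:3\nDTSTART:20180601T020000Z\nEND:VEVENT";

            int a = sequenceOf(copyA);
            int b = sequenceOf(copyB);
            if (a != b) {
                System.out.println("Newer copy: " + (a > b ? "A" : "B")); // prints "Newer copy: B"
            } else {
                // Equal SEQUENCE values: fall back to merging or asking the user, as described above.
                System.out.println("SEQUENCE ties; cannot tell which edit is newer");
            }
        }
    }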