Watches are one-time triggers; if you get a watch event and you want to be notified of future changes, you must set another watch.
Let's assume there is a znode whose data is updated every 5-100 seconds. I want to maintain a local copy of its data. My algorithm is:
1. call get(znode, set_watch=true) to set a watch
2. on znode's data changed:
   2.1. a step with a potential programmer's mistake
   2.2. local_copy = get(znode, set_watch=true) to get the local copy and set another watch
So a single exception in step 2.1 causes step 2.2 to be skipped, which means another watch is not set and all future updates will be lost.
What is the general way to write a robust data-changed listener? My current dirty workaround is to set additional watches on a timer.
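To make the failure mode and one possible fix concrete, here is a minimal sketch using the Python kazoo client (the ZooKeeper address, znode path, and handle_update() are assumptions, not part of the question): the data is re-read and the next watch armed before the failure-prone processing step, so an exception there no longer loses the watch.

import logging
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")       # assumed ZooKeeper address
zk.start()

local_copy = None

def handle_update(data):
    # The step that may contain a programmer's mistake (step 2.1).
    global local_copy
    local_copy = data

def on_change(event):
    # Re-read the data and arm the next watch first (step 2.2)...
    data, _stat = zk.get("/some/znode", watch=on_change)
    try:
        handle_update(data)                    # ...then run the failure-prone step
    except Exception:
        logging.exception("processing failed, but the next watch is already set")

# Initial read (step 1) sets the first watch.
data, _stat = zk.get("/some/znode", watch=on_change)
handle_update(data)

kazoo's DataWatch recipe wraps exactly this kind of re-registration (and also re-arms watches after connection loss), so it is worth a look if Python is an option.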
I am changing my deprecated setMinimumBackgroundFetchInterval() call to the new BGAppRefreshTask, but every example uses an actual time interval, like this:
request.earliestBeginDate = Date(timeIntervalSinceNow: 15 * 60) // Fetch no earlier than 15 minutes from now
but my current source used backgroundFetchIntervalMinimum, like this...
UIApplication.shared.setMinimumBackgroundFetchInterval(UIApplication.backgroundFetchIntervalMinimum)
So how can I apply backgroundFetchIntervalMinimum with BGAppRefreshTask?
Setting the request’s earliestBeginDate is equivalent to the old minimum fetch interval. Just use earliestBeginDate as you have seen in those other examples. When your app runs a background fetch, schedule the next background fetch with its own earliestBeginDate at that point. That is how BGAppRefreshTask works.
If you are asking what is equivalent to UIApplication.backgroundFetchIntervalMinimum, you can just set earliestBeginDate to nil. The documentation tells us that:
Specify nil for no start delay.
Setting the property indicates that the background task shouldn’t start any earlier than this date. However, the system doesn’t guarantee launching the task at the specified date, but only that it won’t begin sooner.
The latter caveat applies to backgroundFetchIntervalMinimum, too. Background fetch, regardless of which mechanism you use, is at the discretion of the OS. It attempts to balance the app’s desire to have current data ready for the user the next time they launch the app with other considerations (such as preserving the device battery, avoiding excessive use of the cellular data plan, etc.).
The earliestBeginDate (like the older backgroundFetchIntervalMinimum) is merely a mechanism whereby you can give the OS additional hints to avoid redundant fetches. E.g., if your backend only updates a resource every twelve hours, there is no point in performing backend refreshes much more frequently than that.
I have a MySQL database that stores all my tracks and their associated information. One of the tables is a queue table from which I pull a track for Liquidsoap to play. I am providing those tracks to Liquidsoap by using request.dynamic.list.
def get_track() =
# Get the first line of my external process
result = list.hd(default="", get_process_lines(scripts ^ "get_track.py"))
print(result)
# Create and return a request using this result
[request.create(result)]
end
# Create the source
sourcetrack = request.dynamic.list(id="play_queue", conservative=false, get_track)
The get_track.py script retrieves a record from a queue table in the database.
I noticed that Liquidsoap will grab two tracks when it starts up. Two get "accepted" and one is "prepared."
Is there a way to get Liquidsoap to only accept one track at a time and wait to accept the next one only when reaching near the end of the currently playing track?
I also have scheduled programs that get added to the queue table in the database. When this occurs, all tracks are cleared from the queue table and the program is then added to it.
Since Liquidsoap appears to have a track already loaded in its queue while playing the "prepared" track, is there a way to remove that queued track so Liquidsoap will not play it next, but instead call the get_track.py script again to load a new track from the queue table in the database?
Liquidsoap always prepares a stream's next items in advance; this is a fundamental principle of its scheduler. It allows a download to start before the downloaded track is played, for example. As long as you are using request.dynamic.list, the script you call must take this into account. In other words, you cannot rely only on clock time to decide which track to return.
As far as I understand your use case, you might prefer using a request.queue source and having your script push each request at the right time via the telnet server.
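For example, here is a minimal sketch of the pushing side in Python, assuming the telnet server is enabled in your Liquidsoap config on localhost port 1234 and the queue source is declared as request.queue(id="play_queue"):

import socket

def push_track(uri, host="127.0.0.1", port=1234):
    # Send "play_queue.push <uri>" to Liquidsoap's telnet server and read the reply.
    with socket.create_connection((host, port)) as conn:
        conn.sendall(f"play_queue.push {uri}\nquit\n".encode())
        return conn.recv(4096).decode()

# e.g. called by your scheduler shortly before the current track ends:
# push_track("/srv/music/next_track.mp3")

Your existing get_track.py logic could then pop the next row from the queue table and push it at the moment it should play, so Liquidsoap only ever holds the requests you have pushed.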
I have a regional object store. I would like to be able to tell a particular object that I want it deleted 5 days from now.
How do you suggest I implement this?
I don't really want to keep track of the object in a database and send delete commands from a separate process when the time comes. Is there any tag that could be set so that deletion occurs at a later time (measured from now, not from a specific time in the past)?
There's no functionality built into Google Cloud Storage to do this.
You can configure Lifecycle Management to delete objects according to a number of criteria (including age), but deleting at a particular date in the future isn't one of the supported conditions, and in fact there's no guarantee that a lifecycle rule will run the same day its condition becomes true. Instead you would have to implement this functionality yourself (e.g., in a Compute Engine or App Engine implementation).
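If you do roll it yourself, one minimal sketch with the google-cloud-storage Python client (the bucket name and the "delete-after" metadata key are made up for illustration): stamp each object with a deletion timestamp in its custom metadata, then run a small periodic sweeper, e.g. a cron job on a Compute Engine instance, that deletes anything past its timestamp.

import datetime
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-regional-bucket")      # hypothetical bucket name

def mark_for_deletion(blob_name, days=5):
    # Record the intended deletion time on the object itself.
    blob = bucket.blob(blob_name)
    expiry = datetime.datetime.utcnow() + datetime.timedelta(days=days)
    blob.metadata = {"delete-after": expiry.isoformat()}
    blob.patch()

def sweep():
    # Run periodically; deletes every object whose recorded time has passed.
    now = datetime.datetime.utcnow()
    for blob in client.list_blobs(bucket):
        delete_after = (blob.metadata or {}).get("delete-after")
        if delete_after and datetime.datetime.fromisoformat(delete_after) <= now:
            blob.delete()

This avoids a separate database, since the schedule lives on the objects themselves, at the cost of listing the bucket on each sweep.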
Is watchman capable of telling the configured command why it is sending a file to that command?
For example:
a file that is new to a folder would possibly carry a FILE_CREATE flag;
a file that is deleted would send the FILE_DELETE flag to the command;
a file that is modified would send a FILE_MOD flag, etc.
Perhaps even a folder that gets deleted (and therefore the files under it) would send a FOLDER_DELETE parameter naming the folder, as well as a FILE_DELETE for the files under it and a FOLDER_DELETE for the folders under it.
Is there such a thing?
No, it can't do that. The reasons why are pretty fundamental to its design.
The TL;DR is that it is a lot more complicated than you might think for a client to correctly process those individual events and in almost all cases you don't really want them.
Most file watching systems are abstractions that simply translate the system-specific notification information into some common form. They don't deal, either very well or at all, with the notification queue overflowing, and don't provide their clients with a way to reliably respond to that situation.
In addition to this, the filesystem can be subject to many and varied changes in a very short amount of time, and from multiple concurrent threads or processes. This makes this area extremely prone to TOCTOU issues that are difficult to manage. For example, creating and writing to a file typically results in a series of notifications about the file and its containing directory. If the file is removed immediately after this sequence (perhaps it was an intermediate file in a build step), by the time you see the notifications about the file creation there is a good chance that it has already been deleted.
Watchman takes the input stream of notifications and feeds it into its internal model of the filesystem: an ordered list of observed files. Each time a notification is received watchman treats it as a signal that it should go and look at the file that was reported as changed and then move the entry for that file to the most recent end of the ordered list.
When you ask Watchman for information about the filesystem it is possible or even likely that there may be pending notifications still due from the kernel. To minimize TOCTOU and ensure that its state is current, watchman generates a synchronization cookie and waits for that notification to be visible before it responds to your query.
The combination of the two things above means that watchman result data has two important properties:
You are guaranteed to have observed all notifications that happened before your query.
You receive the most recent information for any given file only once in your query results (the change results are coalesced together).
Let's talk about the overflow case. If your system is unable to keep up with the rate at which files are changing (e.g., you have a big project, files are being created and deleted very quickly, and the system is heavily loaded), the OS can't fit all of the pending notifications in the buffer resources allocated to the watches. When that happens, it blows those buffers and sends an overflow signal. What that means is that the client of the watching API has missed some number of events and is no longer in sync with the state of the filesystem. If that client maintains state about the filesystem, that state is no longer valid.
Watchman addresses this situation by re-examining the watched tree and synthetically marking all of the files as being changed. This causes the next query from the client to see everything in the tree. We call this a fresh instance result set because it is the same view you'd get when you are querying for the first time. We set a flag in the result so that the client knows that this has happened and can take appropriate steps to repair its own state. You can configure this behavior through query parameters.
In these fresh instance result sets, we don't know whether any given file really changed or not (it's possible that it changed in such a way that we can't detect via lstat) and even if we can see that its metadata changed, we don't know the cause of that change.
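On the client side this typically looks like the sketch below, using the pywatchman Python bindings (the project path and the two helper functions are hypothetical): query with a since clock, and whenever watchman reports a fresh instance, throw local state away and rebuild it from the full result.

import pywatchman

client = pywatchman.client()
root = client.query("watch-project", "/path/to/project")["watch"]
clock = client.query("clock", root)["clock"]

def rebuild_state(files):
    # Hypothetical: discard everything we thought we knew and start over.
    print("fresh instance: rebuilding from", len(files), "entries")

def apply_changes(files):
    # Hypothetical: one coalesced entry per changed file.
    print("applying", len(files), "coalesced changes")

def poll_changes():
    global clock
    result = client.query("query", root, {
        "since": clock,                       # everything observed since our last clock
        "fields": ["name", "exists", "new"],
    })
    clock = result["clock"]                   # remember where this query left off
    if result.get("is_fresh_instance"):
        rebuild_state(result["files"])
    else:
        apply_changes(result["files"])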
There can be multiple events that contribute to why a given file appears in the results delivered by watchman. We don't record them individually because we can't track them with unbounded history; imagine a file that is incrementally written to once every second, all day long. Do we keep 86400 change entries for it per day and deliver those to our clients? What if there are hundreds of thousands of files like this? We'd have to truncate that data, and at that point the loss of data reduces how well you can reason about it.
At the end of all of this, it is very rare for a client to do much more than try to read a file or look at its metadata, and generally speaking, they want to do that only when the file has stopped changing. For this use case, watchman-wait, watchman-make and trigger all have the concept of a settle period that causes the change notifications to be delayed in delivery until after the filesystem has stopped changing.
What does "level-based" and "edge-based" mean in general?
I read "In other words, the system's behavior is level-based rather than edge-based" in the Kubernetes documentation:
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/api-conventions.md
With Google, I could only find:
http://www.keil.com/forum/9423/edge-based-vs-level-based-interrupt/
Thank you.
It also has a more general definition (at least the way we tend to use it in the documentation). A piece of logic is "level based" if it only depends on the current state. A piece of logic is "edge-based" if it depends on history/transitions in addition to the current state.
"Level based" components are more resilient because if they crash, they can come back up and just look at the current state. "Edge-based" components must store the history they rely on (or depend on some other component that stores it), so that when they come back up they can look at the current state and the history. Also, if there is some kind of temporary network partition and an edge-based component misses some of the updates, then it will compute the wrong output.
However, "level based" components are usually less efficient, because they may need to scan a lot of state in order to compute an output, rather than just reading deltas.
Many components are a mixture of the two.
Simple example: You want to build a component that reports the number of pods in READY state. A level-based implementation would fetch all the pods from etcd (or the API server) and count. An edge-based implementation would do that once at startup, and then just watch for pods entering and exiting READY state.
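To make that example concrete, here is a rough sketch of both styles with the official Kubernetes Python client (access via a local kubeconfig is assumed):

from kubernetes import client, config, watch

config.load_kube_config()                      # assumes kubeconfig access to a cluster
v1 = client.CoreV1Api()

def is_ready(pod):
    return any(c.type == "Ready" and c.status == "True"
               for c in (pod.status.conditions or []))

# Level-based: look only at the current state and recount from scratch.
def count_ready_level_based():
    return sum(1 for p in v1.list_pod_for_all_namespaces().items if is_ready(p))

# Edge-based: seed from the current state once, then track transitions.
def track_ready_edge_based():
    ready = {p.metadata.uid
             for p in v1.list_pod_for_all_namespaces().items if is_ready(p)}
    for event in watch.Watch().stream(v1.list_pod_for_all_namespaces):
        pod = event["object"]
        if event["type"] != "DELETED" and is_ready(pod):
            ready.add(pod.metadata.uid)
        else:
            ready.discard(pod.metadata.uid)
        print("ready pods:", len(ready))

The edge-based tracker is cheaper per update, but as described above it silently drifts if the watch misses events; a real controller would periodically re-list to resynchronize.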
I'd say they explained it pretty well on the site:
When a new version of an object is POSTed or PUT, the "spec" is updated and available immediately. Over time the system will work to bring the "status" into line with the "spec". The system will drive toward the most recent "spec" regardless of previous versions of that stanza. In other words, if a value is changed from 2 to 5 in one PUT and then back down to 3 in another PUT the system is not required to 'touch base' at 5 before changing the "status" to 3.
So from that statement, we know that "level based" means a PUT request does not need to be satisfied if the primary goal does not require it; the system is free to skip PUT requests when it sees fit.
This makes me assume that "edge based" systems require every PUT request to be satisfied, even if some requests could be skipped without altering the final result, since that would be the alternative to skipping requests.
I'm no RESTful developer (as you can see from my account activity). I could not find any other source of information for these things, so I'm going by the explanation they gave, which seems pretty straightforward.
The Kubernetes API doesn't store a history of all the changes made to an object. The controller that is responsible for that object cannot reliably observe each change; it may only observe the current state of the object.
The term "level" means the current state of the object, and the term "edge" means a transition to a new state. Controllers are "level-based" because they cannot reliably observe the transitions.
From the API conventions:
When a new version of an object is POSTed or PUT, the spec is updated and available immediately. Over time the system will work to bring the status into line with the spec. The system will drive toward the most recent spec regardless of previous versions of that stanza. For example, if a value is changed from 2 to 5 in one PUT and then back down to 3 in another PUT the system is not required to 'touch base' at 5 before changing the status to 3. In other words, the system's behavior is level-based rather than edge-based. This enables robust behavior in the presence of missed intermediate state changes.