Better way to "mutex" than with a .lock file over the network? - powershell

I have a small setup consisting of n clients (CL0, CL1, ... CLn) that access a Windows share on the server.
On that server a JSON file holds important data that needs to be readable and writable by all participants. It holds key-value pairs that are constantly read and changed:
{
    "CurrentActive": "CL1",
    "DataToProcess": false,
    "NeedsReboot": false,
    "Timestamp": "2020-05-25 16:10"
}
I already got the following done with PowerShell:
When a client writes the file, it first creates a lock file that holds the hostname and the timestamp of the access; after the access, the lock file is removed. Each write "job" first checks whether a lock file exists and whether its timestamp is still valid, and only writes to the file once the lock is gone.
# Some pseudo code:
while ($true) {
    if (!(Test-Path $lockfile)) {
        # no lock present: read (or write) the json file
        Get-Content $json
        break
    } else {
        # wait for some time and try again
        Start-Sleep -Seconds 1
        # check if the lock is from my own hostname
        # check if the timestamp in the lock is still valid,
        # otherwise force-remove the stale lock
    }
}
This works OK, but is quite complex to build, since I had to implement the lock mechanism myself, plus a way to force-remove the lock when a client cannot remove the file for whatever reason, and so on (and I am sure I also introduced some bugs...). On top of that, reading the file sometimes returns an empty file. I assume this happens in the window while another client is writing: the file has already been truncated but not yet filled with the new content.
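One way to close that window (shown here as a Java sketch only to illustrate the pattern; the equivalent .NET calls are available from PowerShell too) is to never write the shared file in place: write the new content to a temporary file on the share and then rename it over the original, so readers see either the old version or the new one, never a half-written file. The paths are placeholders.

import java.io.IOException;
import java.nio.file.*;

public class AtomicJsonWrite {
    public static void writeAtomically(Path target, String json) throws IOException {
        // Write the new content to a temp file in the same directory as the target,
        // so the final rename stays on the same volume/share.
        Path tmp = Files.createTempFile(target.getParent(), "state", ".json.tmp");
        Files.writeString(tmp, json);
        try {
            // Atomic rename: readers see either the old or the new file, never an empty one.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException | FileAlreadyExistsException e) {
            // Some network file systems refuse ATOMIC_MOVE; fall back to a plain replace.
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical UNC path to the shared JSON file.
        Path shared = Paths.get("\\\\server\\share\\state.json");
        writeAtomically(shared, "{ \"CurrentActive\": \"CL1\", \"DataToProcess\": false }");
    }
}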
I was looking at other options such as a mutex, and that works like a charm on a single client with multiple threads, but since it relies on SafeHandles in the scope of one system, it does not work with multiple clients over the network. Is it possible to get a mutex working across machines over the network?
I also stumbled upon AlphaFS, which would allow me to do transactional processing on the filesystem, but that doesn't address the root cause: multiple clients accessing one file at the same time.
Is there a better way to store the data? I was thinking about the Windows Registry, but I could not find anything about mutex-like locking there either.
Any thoughts highly appreciated!

Related

How long does a wormhole file transfer persist

I am trying to use magic-wormhole to receive a file.
My partner and I are in different time zones, however.
If my partner types wormhole send filename, for how long will this file persist (i.e. how much later can I type wormhole receive keyword and still get the file)?
From the "Timing" section in the docs:
The program does not have any built-in timeouts, however it is expected that both clients will be run within an hour or so of each other ... Both clients must be left running until the transfer has finished.
So... maybe? Consider using some cloud storage instead, depending on the file. You could also encrypt it before uploading it to cloud storage if the contents of the file are private.

Creating an atomic process for a netconf edit-config request

I am creating a custom system in which a user-submitted netconf edit-config initiates a set of actions that atomically alter the configuration of our system and then notify the user of its success or failure.
Think of it as a big SQL transaction that, at the end, either commits or rolls back.
So, the steps are:
User submits an edit-config
System accepts config and works to implement this config
If the config is successful, the system sends back a thumbs-up response (not sure of the formal way of doing this)
If the config is a failure, it sends back a thumbs-down response (and I will have to make sure the config is rolled back internally)
All this is done atomically. So, if a user submits two configs in a row, they won't conflict with each other.
Our working idea (probably not the best one) was to accept the edit-config and then, within sysrepo, edit some of our leafs with the success or failure flags inside the same session as the initial change. We were hoping this would keep everything atomic; if the edits happened outside of the session, multiple configuration changes could conflict with each other.
We weren't sure whether to go about this with pure netconf or to leverage sysrepo directly. We noticed all the plugins/bindings made for sysrepo and figured those could be used to talk to our datastore directly.
That said, our working idea is most likely not a best-practice approach. What would be the best way to achieve this?
Our system is:
netopeer 1.1.27
sysrepo 1.4.58
libyang 1.0.167
libnetconf2 1.1.24
And our yang file is
module rxmbn {
  namespace "urn:com:zug:rxmbn";
  prefix rxmbn;

  container rxmbn-config {
    config true;

    leaf raw {
      type string;
    }
    leaf raw_hashCode {
      type int32;
    }
    leaf odl_last_processed_hashCode {
      type int32;
    }
    leaf processed {
      type boolean;
      default "false";
    }
  }
}
Currently we can:
Execute an edit-config to netopeer server
We can see the new config register in the sysrepo datastore
We can capture the moment sysrepo registers the data via sysrepo's API
But we are having problems with:
Atomically editing the datastore during the update session (due to locks, which is normal. In fact, if there is no way to edit during an update session, that is fine and not necessary. The main goal is the next bullet)
Atomically reacting to the new edit-config and responding to the end user
We are all a bit new to netconf and yang, so I am sure there is some way to leverage the notification api or event api, either through the netopeer session or sysrepo; we just don't know enough yet.
If there are any examples or implementation advice to create an atomic transaction for this, that'd be really useful.
I know nothing of sysrepo so this is from a NETCONF perspective.
NETCONF servers process requests serially within a single session in a request-response fashion, meaning that everything you do within a single NETCONF session should already be "atomic" - you cannot send two requests and have them applied in reverse order or in parallel no matter what you do. A well-behaved client would also wait for each response from the server before sending a new request, especially if all updates must execute successfully and in a specific order. The protocol also defines no way to cancel a request already sent to a server.
If you need to prevent other sessions from modifying a datastore while your session is performing a sequence of <edit-config> operations, you use the <lock> and <unlock> NETCONF operations to lock the entire datastore. There is also RFC 5717 and partial locking, which would only lock a specific branch of the datastore.
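For reference, a datastore lock is just another RPC on the same session; this is the standard RFC 6241 form of locking and later unlocking the running datastore (message-ids are arbitrary):

<rpc message-id="101" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <lock>
    <target>
      <running/>
    </target>
  </lock>
</rpc>

<!-- ... perform one or more <edit-config> requests ... -->

<rpc message-id="102" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <unlock>
    <target>
      <running/>
    </target>
  </unlock>
</rpc>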
Using notifications to report success of an <edit-config> would be highly unusual - that's what <rpc-reply> and <rpc-error> are there for within the same session. You would use notifications to inform other sessions about what's happening. In fact, there are standard base notifications for config changes.
I suggest reading the entire RFC6241 before proceeding further. There are things like candidate datastores, confirmed-commits, etc. you should know about.
Which component are you developing? Netconf client/manager or Netconf server?
In general, the Netconf server should implement individual Netconf RPC operations in an atomic way.
When a Netconf client wants to perform a set of operations in an atomic way, it should follow the procedure explained in Appendix E.1 of RFC 6241.

java imap, performance issues, fetching all mails

I will use the Java Mail API to handle mail the way Thunderbird etc. do. I have to fetch mail from folders holding 1000 messages. My design will be: when the user performs a sync on a folder, I will get all UIDs of the messages in the folder:
// Get all messages in the folder by UID
Message[] msgs = ufolder.getMessagesByUID(1, UIDFolder.LASTUID);
// Use a suitable FetchProfile so envelopes, flags and UIDs
// are downloaded in bulk instead of one round trip per message
FetchProfile fp = new FetchProfile();
fp.add(FetchProfile.Item.ENVELOPE);
fp.add(FetchProfile.Item.FLAGS);
fp.add(UIDFolder.FetchProfileItem.UID);
ufolder.fetch(msgs, fp);
I will then compare the list of UIDs with the list stored in my db.
For the deleted ones (a message is not in the folder but is in the db), I will mark it as deleted.
For the new ones (a message is in the folder but not in the db), I will mark it as possibly new. But because message UIDs are not safe (they can be changed by the mail server in some cases), I will additionally use a custom hash value for the new mails, built as an MD5 hash of the Message-ID header + subject + receive date. Only for the possibly new mails will I use this hash to catch the actually new ones.
For moved messages, because their UIDs change in the new folder, a message will be flagged as deleted in the first folder and appear as new in the new folder, but it will keep the same custom hash value because the Message-ID header and the other properties stay the same during the move.
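As a sketch of that custom hash (plain JavaMail; which fields to include is of course your choice):

import javax.mail.Message;
import javax.mail.MessagingException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class MessageHash {
    // Builds an MD5 hash over Message-ID + subject + received date,
    // so a message keeps the same hash even when its UID changes (e.g. after a move).
    static String customHash(Message msg) throws MessagingException, NoSuchAlgorithmException {
        String[] ids = msg.getHeader("Message-ID");
        String raw = (ids != null && ids.length > 0 ? ids[0] : "")
                + "|" + msg.getSubject()
                + "|" + msg.getReceivedDate();
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(raw.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}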
Question about the performance issue: on each click on a folder (folder sync) I will compare all UIDs in the folder with the local UID list stored in the db to find the deleted ones. I could not find a better way to accomplish this. As you know, Thunderbird catches a deleted message immediately without a re-login, even if the folder is large and the deleted message is very old (5 years). I think Thunderbird also compares all message UIDs in that folder with a list stored locally.
How can I implement a better sync mechanism with better performance? Does Thunderbird apply a different approach? How can Thunderbird accomplish it so quickly?
If I were interested only in new messages, I could have kept the last stored UID and compared only messages newer than that, but for the deleted ones I still have to compare the full folder. Additionally, the UIDNEXT value is always -1 on my mail server; even if it were set correctly, it would not help to detect deleted messages, so a full compare is a must I think. Am I wrong?
Note: I cannot use or add message listeners because the application is client-server based, the mail handling task is on the server side, and we do not support threads, listeners etc. Events are triggered from the client, the request is processed on the server, a response is returned, and the client handles the response in the GUI.
What you want is called CONDSTORE or quick resync (QRESYNC), RFC 7162 in both cases. That's what Thunderbird uses.
That's a pair of extensions that support commands like "give me all the UIDs that have changed since the last time I connected", "tell me what's been deleted" and so on.
If you can't use threads to listen for these events from the mail server, your options are very limited. Probably the best thing you can do is limit the resynchronization to the messages that are visible to the client.
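If your server advertises QRESYNC, JavaMail can ask for the delta at open time. A rough sketch, assuming a JavaMail build that ships com.sun.mail.imap.ResyncData (check its docs, the exact API may differ in your version):

import com.sun.mail.imap.IMAPFolder;
import com.sun.mail.imap.ResyncData;
import javax.mail.Folder;
import javax.mail.MessagingException;
import javax.mail.Store;
import javax.mail.event.MailEvent;
import java.util.List;

public class QresyncSketch {
    // uidValidity and highestModSeq are the values stored in your db
    // after the previous synchronization of this folder.
    static List<MailEvent> resync(Store store, long uidValidity, long highestModSeq)
            throws MessagingException {
        IMAPFolder folder = (IMAPFolder) store.getFolder("INBOX");
        // Opening with ResyncData asks the server for everything that changed
        // since highestModSeq: vanished (deleted) UIDs, flag changes and new mail,
        // so no full UID comparison is needed.
        return folder.open(Folder.READ_WRITE, new ResyncData(uidValidity, highestModSeq));
    }
}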

Avoid duplicate POSTs with REST

I have been using POST in a REST API to create objects. Every once in a while, the server will create the object, but the client will be disconnected before it receives the 201 Created response. The client only sees a failed POST request, and tries again later, and the server happily creates a duplicate object...
Others must have had this problem, right? But I google around, and everyone just seems to ignore it.
I have 2 solutions:
A) Use PUT instead, and create the (GU)ID on the client.
B) Add a GUID to all objects created on the client, and have the server enforce their UNIQUE-ness.
A doesn't match existing frameworks very well, and B feels like a hack. How do other people solve this, in the real world?
Edit:
With Backbone.js, you can set a GUID as the id when you create an object on the client. When it is saved, Backbone will do a PUT request. Make your REST backend handle PUT to non-existing id's, and you're set.
Another solution that's been proposed for this is POST Once Exactly (POE), in which the server generates single-use POST URIs that, when used more than once, will cause the server to return a 405 response.
The downsides are that 1) the POE draft was allowed to expire without any further progress on standardization, and thus 2) implementing it requires changes to clients to make use of the new POE headers, and extra work by servers to implement the POE semantics.
By googling you can find a few APIs that are using it though.
Another idea I had for solving this problem is that of a conditional POST, which I described and asked for feedback on here.
There seems to be no consensus on the best way to prevent duplicate resource creation in cases where the client cannot generate the unique URI needed for PUT and POST is therefore required.
I always use B -- detection of dups due to whatever problem belongs on the server side.
Detection of duplicates is a kludge, and can get very complicated. Genuine distinct but similar requests can arrive at the same time, perhaps because a network connection is restored. And repeat requests can arrive hours or days apart if a network connection drops out.
All of the discussion of identifiers in the other answers is with the goal of giving an error in response to duplicate requests, but this will normally just incite a client to get or generate a new id and try again.
A simple and robust pattern to solve this problem is as follows: Server applications should store all responses to unsafe requests, then, if they see a duplicate request, they can repeat the previous response and do nothing else. Do this for all unsafe requests and you will solve a bunch of thorny problems. Repeat DELETE requests will get the original confirmation, not a 404 error. Repeat POSTS do not create duplicates. Repeated updates do not overwrite subsequent changes etc. etc.
"Duplicate" is determined by an application-level id (that serves just to identify the action, not the underlying resource). This can be either a client-generated GUID or a server-generated sequence number. In this second case, a request-response should be dedicated just to exchanging the id. I like this solution because the dedicated step makes clients think they're getting something precious that they need to look after. If they can generate their own identifiers, they're more likely to put this line inside the loop and every bloody request will have a new id.
Using this scheme, all POSTs are empty, and POST is used only for retrieving an action identifier. All PUTs and DELETEs are fully idempotent: successive requests get the same (stored and replayed) response and cause nothing further to happen. The nicest thing about this pattern is its Kung-Fu (Panda) quality. It takes a weakness: the propensity for clients to repeat a request any time they get an unexpected response, and turns it into a force :-)
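A minimal sketch of that store-and-replay idea (class and method names are just illustrative):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class IdempotentHandler {
    // Stored responses to unsafe requests, keyed by the client-supplied action id.
    private final Map<String, String> storedResponses = new ConcurrentHashMap<>();

    // 'process' is whatever actually performs the unsafe operation and builds the response.
    public String handle(String actionId, Supplier<String> process) {
        // If we have already answered this action id, replay the old response
        // and do nothing else; otherwise run the operation and remember its response.
        return storedResponses.computeIfAbsent(actionId, id -> process.get());
    }
}

In a real system the stored response would be written in the same database transaction as the action itself, so the reply and the state change are recorded together.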
I have a little google doc here if anyone cares.
You could try a two step approach. You request an object to be created, which returns a token. Then in a second request, ask for a status using the token. Until the status is requested using the token, you leave it in a "staged" state.
If the client disconnects after the first request, they won't have the token and the object stays "staged" indefinitely or until you remove it with another process.
If the first request succeeds, you have a valid token and you can grab the created object as many times as you want without it recreating anything.
There's no reason why the token can't be the ID of the object in the data store. You can create the object during the first request. The second request really just updates the "staged" field.
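A rough sketch of that two-step flow (names are illustrative, and an in-memory map stands in for the data store):

import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class TwoStepCreate {
    record Order(String id, String payload, boolean staged) {}

    private final Map<String, Order> store = new ConcurrentHashMap<>();

    // Step 1: create the object in a "staged" state and hand the token (here: the id) back.
    public String stage(String payload) {
        String id = UUID.randomUUID().toString();
        store.put(id, new Order(id, payload, true));
        return id;
    }

    // Step 2: the client asks for the status using the token; this flips "staged" off.
    // Repeating this step is harmless, so a retry never creates a second object.
    public Order confirm(String id) {
        return store.computeIfPresent(id, (k, o) -> new Order(o.id(), o.payload(), false));
    }
}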
Server-issued Identifiers
If you are dealing with the case where it is the server that issues the identifiers, create the object in a temporary, staged state. (This is an inherently non-idempotent operation, so it should be done with POST.) The client then has to do a further operation on it to transfer it from the staged state into the active/preserved state (which might be a PUT of a property of the resource, or a suitable POST to the resource).
Each client ought to be able to GET a list of their resources in the staged state somehow (maybe mixed with other resources) and ought to be able to DELETE resources they've created if they're still just staged. You can also periodically delete staged resources that have been inactive for some time.
You do not need to reveal one client's staged resources to any other client; they need exist globally only after the confirmatory step.
Client-issued Identifiers
The alternative is for the client to issue the identifiers. This is mainly useful where you are modeling something like a filestore, as the names of files are typically significant to user code. In this case, you can use PUT to do the creation of the resource as you can do it all idempotently.
The down-side of this is that clients are able to create IDs, and so you have no control at all over what IDs they use.
There is another variation of this problem. Having a client generate a unique id means we are asking the customer to solve this problem for us. Consider an environment with publicly exposed APIs and hundreds of clients integrating with them. Practically, we have no control over the client code or the correctness of its implementation of uniqueness. Hence, it would probably be better for the server itself to detect whether a request is a duplicate. One simple approach is to calculate and store a checksum of every request based on attributes of the user input, define a time threshold (x minutes), and compare every new request from the same client against the ones received in the past x minutes. If the checksum matches, it could be a duplicate request, and you can add some challenge mechanism for the client to resolve it.
If a client makes two different requests with the same parameters within x minutes, it might be worth making sure this is intentional, even if each one comes with a unique request id.
This approach may not be suitable for every use case; however, I think it will be useful where the business impact of executing the second call is high and could potentially cost the customer. Consider a payment processing engine where an intermediate layer ends up retrying a failed request, or where a customer double-clicks and the client layer submits two requests.
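As a sketch, the checksum window could be as simple as this (field names and the x-minute threshold are placeholders):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Duration;
import java.time.Instant;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DuplicateRequestDetector {
    private static final Duration WINDOW = Duration.ofMinutes(5); // "x mins"
    // checksum of clientId + request attributes -> when we last saw it
    // (entries older than the window could be evicted periodically; omitted here)
    private final Map<String, Instant> recent = new ConcurrentHashMap<>();

    public boolean looksLikeDuplicate(String clientId, String requestBody)
            throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest((clientId + "|" + requestBody).getBytes(StandardCharsets.UTF_8));
        String checksum = HexFormat.of().formatHex(digest);
        Instant now = Instant.now();
        Instant previous = recent.put(checksum, now);
        // Duplicate only if the same checksum was seen within the window.
        return previous != null && Duration.between(previous, now).compareTo(WINDOW) <= 0;
    }
}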
Design
Automatic (without the need to maintain a manual black list)
Memory optimized
Disk optimized
Algorithm [solution 1]
REST arrives with UUID
Web server checks if UUID is in Memory cache black list table (if yes, answer 409)
Server writes the request to DB (if it was not filtered by ETS)
DB checks if the UUID is repeated before writing
If yes, answer 409 to the server, and add the UUID to the blacklist in the Memory Cache and on Disk
If not repeated, write to DB and answer 200
Algorithm [solution 2]
REST arrives with UUID
Save the UUID in the Memory Cache table (expire for 30 days)
Web server checks if UUID is in Memory Cache black list table [return HTTP 409]
Server writes the request to DB [return HTTP 200]
In solution 2, the Memory Cache blacklist is created ONLY in memory, so the DB is never checked for duplicates. The definition of 'duplicate' is "any request that arrives within the period of time". We also replicate the Memory Cache table to disk, so we can refill it before starting up the server.
In solution 1, there will never be a duplicate, because we always check on disk ONLY once before writing, and if it is a duplicate, the next round trips are handled by the Memory Cache. This solution is better for BigQuery, because requests there are not idempotent, but it is also less optimized.
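A tiny sketch of the solution 2 idea, keeping only the in-memory check (the 30-day expiry and the disk replication are left out; names are placeholders):

import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UuidBlacklist {
    // UUID -> time it was first seen; acts as the in-memory blacklist.
    private final Map<String, Instant> seen = new ConcurrentHashMap<>();

    // Returns the HTTP status to answer: 409 if the UUID was already used, 200 otherwise.
    public int register(String uuid) {
        return seen.putIfAbsent(uuid, Instant.now()) == null ? 200 : 409;
    }
}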

NSMutableURLRequest on succession of another NSMutableURLRequest's success

Basically, I want to implement SYNC functionality where, if an internet connection is not available, data gets stored in a local sqlite database. Whenever an internet connection becomes available, SYNC kicks into action.
Now, say for example 5 records are stored locally, and then an internet connection becomes available. I want the server to be updated. So, what I do currently is:
Post first record to the server.
Wait for the success of first request.
Post a local NSNotification to a routine, saying that the first record has been updated on the server and the second request can now go.
The routine fires the second post request on server and so on...
Question: Is this approach right and efficient enough to implement SYNC functionality, or is there anything I should change?
NOTE: There is no limit on the number of records to be synced.
Well it depends on the requirements on the data that you save. If it is just for backup then you should be fine.
If the 5 records are somehow dependent on each other and you need to access this data from another device/application you should take care on the server side that either all 5 records are written or none. Otherwise you will have an inconsistent state if only 3 get written.
If other users are also reading/writing that data concurrently on the server, then you need to implement some kind of lock on all records before writing, and also decide how to handle conflicts when someone attempts to overwrite somebody else's changes.
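If the records belong together, one option along those lines is to upload them in a single request so the server can apply them in one transaction (all or nothing). A sketch in Java just for illustration; the endpoint and payload shape are made up:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BatchSync {
    public static void main(String[] args) throws Exception {
        // All locally queued records go up as one JSON array; the server either
        // stores all of them in a single transaction or rejects the whole batch.
        String pendingRecords = "[{\"id\":1},{\"id\":2},{\"id\":3},{\"id\":4},{\"id\":5}]";

        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/api/records/batch"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(pendingRecords))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // On success (e.g. 201), mark all local records as synced; on failure, none are.
        System.out.println(response.statusCode());
    }
}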