I have two closed-source applications that must share the same data at some point. Both use REST APIs.
A concrete example is helpdesk tickets: they can be created in either application, and when a user adds or closes a ticket in one application I need to update the other, and vice versa.
Since they are closed-source, I can't modify the code.
I was thinking of creating a third application that, every 5 minutes or so, lists both applications' tickets, compares them with the result of the previous call, and updates the other application if anything has changed.
Is there a better way of doing this?
With closed-source applications it's nearly impossible to get something out of them, unless they have some plugin-based setup that you can hook into.
The most efficient way in terms of costs would be to have the first application publish a message on a queue, or call a web-hook that you set, whenever the event is triggered. But as I mentioned, the application needs to support that.
So yes, your solution is pretty much all you can do for now, but keep in mind the challenges you may encounter over time:
What if the results of both APIs are too large to be compared directly? You may need to page through the results.
What if your app crashes and you lose the previous state? You need to back it up in an external store.
How often should you poll the API to get the updates you need without degrading performance for the existing traffic?
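As a rough illustration of the polling approach and the crash-recovery point above, here is a minimal Python sketch of one polling cycle. Everything named here is hypothetical: `fetch_source` and `push_to_target` stand in for calls to the two REST APIs, tickets are assumed to be dicts keyed by a string ID, and the previous state is kept in a JSON file. It ignores paging and does not guard against a pushed change being picked up again when syncing in the opposite direction.

```python
import json
import os
import tempfile


def diff_tickets(previous, current):
    """Return (created, changed) ticket IDs between two snapshots.

    Each snapshot is a dict mapping a string ticket ID to its fields.
    """
    created = sorted(tid for tid in current if tid not in previous)
    changed = sorted(
        tid for tid in current
        if tid in previous and current[tid] != previous[tid]
    )
    return created, changed


def load_snapshot(path):
    """Load the last saved snapshot, or an empty one on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def save_snapshot(path, snapshot):
    """Write to a temp file, then rename, so a crash cannot leave half a file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle)
    os.replace(tmp_path, path)


def sync_once(fetch_source, push_to_target, snapshot_path):
    """One polling cycle: fetch, diff against the saved snapshot, push, save."""
    previous = load_snapshot(snapshot_path)
    current = fetch_source()
    created, changed = diff_tickets(previous, current)
    for ticket_id in created + changed:
        push_to_target(ticket_id, current[ticket_id])
    save_snapshot(snapshot_path, current)
    return created, changed
```

Running `sync_once` from a scheduler every few minutes, once per direction, would approximate the "third application" described in the question.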
I am working on a desktop application that requires synchronization between several clients. Basically, a group of people (let's say between 2 and 10) all run the same application. One of them hosts a server and the other clients connect to that server. The client that hosts the server also connects to his own server.
The applications should stay synchronized between all clients, meaning all clients see the same data in the application. Specifically, the data in question I can define in two separate forms:
1. A simple property with a certain value (this value must stay synchronized)
2. A list of properties (the items in the list and their values must stay synchronized)
Simple examples of (1) could be: which item in a list does the client currently have selected, and what's the current location of the client's mouse pointer within the application window. These properties keep changing continuously but the number of these properties is constant and does not grow (e.g. defined during design time).
An example of (2) could be a list of chat messages. These lists will grow during runtime with no way to predict how many items there will be.
Here is some example code in C# for the state, client and chat messages:
using System.Collections.Generic;

public class State
{
    // A single value shared between all clients
    public int SimpleInteger {get;set;}
    // List of connected clients and their individual states
    public List<Client> Clients {get;set;}
    // List of chat messages
    public List<ChatMessage> Messages {get;set;}
}

public class Client
{
    public string ClientId {get;set;}
    public string Username {get;set;}
    public ClientState ClientState {get;set;}
}

public class ClientState
{
    public string ClientId {get;set;}
    public int SelectedIndex {get;set;}
    public int MouseX {get;set;}
    public int MouseY {get;set;}
}

public class ChatMessage
{
    public string ClientId {get;set;}
    public string Message {get;set;}
}
I've been working on this on and off for a long time but whatever kind of state synchronization I came up with, it never worked well.
When I search for solutions, I only ever find solutions for games, but those are not very helpful because my requirements are different:
I cannot tolerate "dropped updates", and I cannot predict (interpolate or extrapolate) what the other clients are doing. Every client needs to receive every update to stay in sync.
On the other hand, I don't care about lag (within reason). It is fine if I see the updates of other clients with about a second's delay.
When a new client connects (or reconnects), a large portion of the state must be transferred (for example: the list of chat messages from example 2). Each client is required to know the entire history of the chat, so this must be downloaded when a client connects.
My current solution can be summarized as follows:
The server keeps track of the state, i.e. the source of truth.
The state contains the properties that require synchronizing.
The state also contains a list of connected users (and their usernames etc).
Clients also each keep a local copy of the state, which they can act upon immediately. For example, they update their mouse position in their local state continuously.
Whenever a client updates its local state, this update is sent to the server.
Potential exceptions here are things that change too fast, such as the mouse position; those I will only send at regular intervals.
The server also updates the common "source of truth" state.
Finally, the server updates all other clients with the new updated state.
The last two steps are where I'm struggling. I can think of two methods to synchronize the state, one is easy but probably not efficient and the other is efficient but prone to errors.
1. The server simply sends the entire state to all clients.
As soon as the server receives an update from the client, the update is applied to the state and the new state is broadcast. Every other client replaces their local state.
I feel this will probably work, but the state can grow in size quickly due to the "list" items (for example chat messages). In my previous attempts, this quickly became a problem and sending the state back became much too slow.
2. The server re-sends the same update (that it received) to all other clients.
Each client then only applies the new update to their state locally to sync back with the server.
This is probably much more efficient and sending the entire state is only necessary when a client connects.
However, in the past I frequently ran into desync issues where clients were no longer in sync. I don't really know what caused it, probably conflicts between messages (for example the server telling the client to update a value in the state when the client has just updated its local value; which one has precedence?). After this happened, everything went completely wrong, as the updates were now being applied to two different states and had different outcomes.
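One common way to make the second method less fragile is to let the server stamp every update with a sequence number, so that all clients apply the same updates in the same order and can notice gaps. This is not something described above; it is an assumption added for illustration, and the `Server` and `Client` classes below are hypothetical, with the state reduced to a flat dictionary. A minimal Python sketch:

```python
class Server:
    """Source of truth: orders every update and logs it for late joiners."""

    def __init__(self):
        self.state = {}
        self.log = []  # every applied update, in the order it was applied

    def submit(self, key, value):
        """Apply a client's update and return the sequenced message to broadcast."""
        seq = len(self.log) + 1
        update = {"seq": seq, "key": key, "value": value}
        self.state[key] = value
        self.log.append(update)
        return update

    def updates_since(self, seq):
        """Updates a client is missing after it last applied `seq`."""
        return self.log[seq:]


class Client:
    """Applies server updates strictly in sequence order."""

    def __init__(self):
        self.state = {}
        self.last_seq = 0
        self.pending = {}  # out-of-order updates waiting for a gap to fill

    def receive(self, update):
        if update["seq"] <= self.last_seq:
            return  # duplicate, already applied
        self.pending[update["seq"]] = update
        # Apply as many consecutive updates as are now available.
        while self.last_seq + 1 in self.pending:
            nxt = self.pending.pop(self.last_seq + 1)
            self.state[nxt["key"]] = nxt["value"]
            self.last_seq = nxt["seq"]
```

In this sketch the sending client would also treat the server's sequenced copy of its own update as authoritative, which removes the "who has precedence" question. A client that connects or reconnects can ask for `updates_since(last_seq)` instead of the full state.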
I'm looking for some guidance on general concepts on how to achieve this. I'm using several messaging libraries to achieve the actual communication between client and server and that part is not an issue I think. I can make sure in these libraries that every message is received for example (though I'm not sure if the order is guaranteed). Like I said before, lag is not an issue, but I must guarantee every state update is received both by the server and by every other client.
Any help would be great! Thanks.
This is a hard problem and there are enough tricky areas that I wouldn't want to build this myself. Authentication, conflicting updates, API management, network outages, single point of failure, and local persistence come to mind.
If you're up for using a cloud-based solution, Google Cloud Firestore takes care of those tricky areas and does what you need:
Clients save data to the database, by creating, updating, or deleting records. Example code.
Whenever a record is created, updated, or deleted, all clients get realtime notifications. Example code.
(After you follow the links above, make sure you click C# above the code boxes to see the C# code).
This is a complicated issue, with many moving parts, as you seem to understand. As I've been researching this, I've read a couple of comments on questions like this one on a variety of Q&A sites, stating this kind of thing is a project all on its own.
Disclaimer: I haven't done this myself, so I don't know how well this would work, but maybe you can take my suggestions and work with them, if you haven't already done so. I've worked on projects where this was implemented, but I wasn't part of that implementation directly.
Connection
Since you haven't said which library you are using for the connection, I'm going to assume you are using websockets or something similar. If not, I suggest you move to something like websockets. They allow a (near) constant connection between client and server so that data can be pushed in both directions, which saves the client from having to poll and pull the data. The link below seems to have a decent walk-through on how to do it, so I won't try to. Because links die, here's the first example code they give, which seems pretty simple.
using System.Net.Sockets;
using System.Net;
using System;

class Server {
    public static void Main() {
        TcpListener server = new TcpListener(IPAddress.Parse("127.0.0.1"), 80);
        server.Start();
        Console.WriteLine("Server has started on 127.0.0.1:80.{0}Waiting for a connection...", Environment.NewLine);
        TcpClient client = server.AcceptTcpClient();
        Console.WriteLine("A client connected.");
    }
}
https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_server
Client start up
Once you have a stable connection between server and client, you need to make sure the data is in sync. When the user starts the app, you can get the timestamp of the latest change in each table and compare that to the server. If they are exactly the same, you have a somewhat reasonable expectation that the table hasn't changed. I'm assuming each table has a column containing the timestamp for the last edit made to the row.
For the tables that have changed, you can have the server send the new and updated rows to the client based on the client's "last changed timestamp".
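To make the "last changed timestamp" idea concrete, here is a small Python sketch. The row layout (`id`, `updated_at`) and the function names are assumptions for illustration only, timestamps are plain numbers, and deleted rows are not handled.

```python
def rows_changed_since(server_rows, last_synced_at):
    """Return server rows edited after the client's last known timestamp."""
    return [row for row in server_rows if row["updated_at"] > last_synced_at]


def apply_delta(client_rows, delta):
    """Upsert the delta into the client's table (a dict keyed by row ID)."""
    for row in delta:
        client_rows[row["id"]] = row
    # The newest timestamp now on the client becomes its next sync marker.
    return max((row["updated_at"] for row in client_rows.values()), default=0)
```

The client would send its marker, receive only the newer rows, and store the returned marker for the next start-up.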
Since the internet isn't 100% guaranteed to be connected, you will also need to keep track of the times the client has been connected vs. when they've been on the app (unless the app just won't work without being connected to the server). This information also needs to be sent to the server to compare to data changed during intervals where the client hasn't been connected.
Once timestamp matching has been done, you need to compare the row counts. If they match, you can more reasonably assume the tables are the same. If they don't, you can try matching IDs/primary keys. There are a variety of ways to do this, including 1:1 matching (slowest but most reliable), or you can do some math with the IDs (assuming numerical IDs) and look for differences in batches of, say, 100 rows. Idea: if the sum of the sorted, auto-increment integer IDs for the first 100 rows is the same on the client and the server, those rows very likely exist on both sides (equal sums are a strong hint, not proof); if the sums don't match, you can fall back to the 1:1 match to see what's missing. Because this can be lengthy for large databases, you may want to track this type of sync in another table, so it doesn't need to be done all the time.
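A possible shape for the batch check, sketched in Python. Two things here are my own variations rather than part of the idea above: batches are formed by ID range (IDs 0-99, 100-199, ...) instead of by row position, so that one missing row does not shift every later batch, and each batch is fingerprinted by row count plus ID sum. The function names are hypothetical.

```python
from collections import defaultdict


def batch_fingerprints(ids, batch_size=100):
    """Map batch number -> (row count, sum of IDs) for the IDs in that range."""
    counts = defaultdict(int)
    sums = defaultdict(int)
    for row_id in ids:
        batch = row_id // batch_size
        counts[batch] += 1
        sums[batch] += row_id
    return {batch: (counts[batch], sums[batch]) for batch in counts}


def find_differences(client_ids, server_ids, batch_size=100):
    """Return (missing_on_client, missing_on_server), checking batches first."""
    client_fp = batch_fingerprints(client_ids, batch_size)
    server_fp = batch_fingerprints(server_ids, batch_size)
    suspect = {
        batch for batch in set(client_fp) | set(server_fp)
        if client_fp.get(batch) != server_fp.get(batch)
    }
    # Only the suspect batches get the slower 1:1 comparison.
    client_set = {i for i in client_ids if i // batch_size in suspect}
    server_set = {i for i in server_ids if i // batch_size in suspect}
    return sorted(server_set - client_set), sorted(client_set - server_set)
```

A matching fingerprint can still hide a difference in rare cases (for example IDs 1 and 4 versus 2 and 3), so an occasional full comparison remains useful.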
Instead, you may want a table to track all the data not sent to a client. This would require a confirmation that the data sent was correctly inserted into the client DB. This could also work on the client side to track what hasn't been sent to the server. Of course, this kind of thing can get cumbersome quickly, even if you're just tracking keys, table names, and timestamps. You can rack up millions of rows quickly, if you don't remove old data periodically. This is why I suggest tracking unsent data, so that anything that becomes "sent" is no longer tracked by this table and removed.
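The "track only what is unsent" idea could look roughly like this in Python. The `Outbox` class and its method names are hypothetical, and the pending changes live in memory here, whereas the text above has them in a database table.

```python
class Outbox:
    """Tracks changes that the other side has not yet confirmed."""

    def __init__(self):
        self._pending = {}  # change ID -> change description
        self._next_id = 1

    def record(self, table, key, payload):
        """Remember a change that still has to be delivered."""
        change_id = self._next_id
        self._next_id += 1
        self._pending[change_id] = {"table": table, "key": key, "payload": payload}
        return change_id

    def unsent(self):
        """Changes to send or resend, oldest first."""
        return sorted(self._pending.items())

    def acknowledge(self, change_id):
        """Drop a change once the receiver confirms it was stored."""
        self._pending.pop(change_id, None)
```

Because acknowledged entries are removed, the tracking data only ever holds what is still outstanding, which keeps it from growing without bound.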
If you don't want to code and manage all that, you can look for a library that does it. There are a variety out there. Even Microsoft has one, but it's on extended support only until 1/1/2021. What happens after that, I doubt even Microsoft knows, but it gets you 1.25 years to come up with a different solution.
Creating Synchronization Providers With The Sync Framework
The Sync Framework can be used to build apps that synchronize data from any data store using any protocol over a network. We'll show you how it works and get you started building a custom sync provider.
https://learn.microsoft.com/en-us/previous-versions/sql/synchronization/mt490616(v=msdn.10)
https://support.microsoft.com/en-us/lifecycle/search?alpha=Microsoft%20Sync%20Framework%202.1
Normal runtime
Once you have your data synced on startup (or in the background after startup), you can simply send the data to the server normally, as in when the user makes changes. Since you'll have a websocket-type connection, any changes the server gets from one client can be pushed to all the other clients.
As far as changing the data in real time in your app, you may have to be constantly polling your local/client DB for timestamp changes so the UI can be appropriately updated. There may be something within C# that does this for you or another library you can find.
Conclusion
At this point, I'm out of ideas. It seems reasonable to me this would work, even though it's a lot of work. Hopefully you can take what I have and use it as a foundation to your own ideas on how to accomplish your task. It seems there's a lot of work ahead of you, so good luck!
Footnote
As I'm currently the only answer after several days of it being unanswered, I'm going to assume no one else has anything better to suggest. If they do, I'd encourage them to make their own answer instead of complaining about mine. People tweaking this answer is expected, but please remember community standards when making comments.
I'm only answering this because I haven't seen anyone else do it on this or other sites. It's only been bits and disconnected pieces here & there, with people still not being able to make sense of it as a whole.
This and similar questions have been asked before on this site and closed as "too broad". If you feel this same way as a reader, please vote so on the Question not this answer.
There are several solutions to your problem.
You could use a BizTalk server out of the box. This may not be what you have in mind.
If you want something more home-brewed, you could use WCF (Windows Communication Foundation) with MSMQ (Microsoft Message Queue). This would give you guaranteed message delivery, and durable messages (if you want). You would not have to worry about lost connections and other errors occurring during message transmission.
You can go down another level and use direct TCP and UDP protocols to transmit messages. But now, you have to take care of more error cases.
Any SQL DBMS implements one important part of your problem statement: it maintains shared state. Consider what ACID promises:
Consistency. At any one instant, all clients reading from the database are guaranteed to see the same information.
Atomicity. The client updating the database can use as many steps as needed. When the transaction is committed, the data are changed entirely or not at all.
Isolation. The server gives each client the illusion of interacting with it alone. It handles concurrent updates, and updates the database as though the updates arrived serially.
You may not care about durability for this application.
The mediation among the clients is, for my money, the most useful feature of the DBMS for your application. That will save you work, and headaches. Another, non-obvious, benefit is that it can enforce consistency rules for the state information; that can be remarkably useful to prevent an obsolete/corrupt client from munging the shared state.
The second part of your problem statement is notifying 2-10 clients of changed state. There are any number of ways to do that.
Some DBMSs can access OS services from triggers. You could have an update trigger issue a notification. Alternatively, the updating client could do that.
The actual notification mechanism could be quite simple. Clients could connect to a server (that you write) and block on read(2). The server itself listens on a port for update notifications. On receipt of one, it repeats it to all connected clients. When the client's read request returns, it's time to query the database for the updated state, and post a new read.
To prevent a kind of "thundering herd" problem when several updates arrive back-to-back, when a client reads the update message, it could keep reading updates until EWOULDBLOCK, and only then query the DBMS. OTOH, if it's important to see the intermediate states (to see every update, not just the current state), the DBMS is perfectly capable of storing and providing all versions and distinguishing them with a timestamp or serial number.
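The "keep reading until EWOULDBLOCK" step can be sketched in Python, where a non-blocking `recv` with nothing to read raises `BlockingIOError`. The newline-terminated message format and the `query_database` callback are assumptions for illustration.

```python
import socket


def drain_notifications(sock, chunk_size=4096):
    """Read every queued notification; stop when the socket would block."""
    received = b""
    while True:
        try:
            data = sock.recv(chunk_size)
        except BlockingIOError:  # EWOULDBLOCK / EAGAIN: nothing left for now
            break
        if not data:  # the peer closed the connection
            break
        received += data
    return received.count(b"\n")  # one newline-terminated message per update


def handle_wakeup(sock, query_database):
    """Coalesce back-to-back notifications into a single database query."""
    count = drain_notifications(sock)
    if count:
        return count, query_database()
    return 0, None
```

In a real client the socket would typically be waited on first (a blocking read or `select`), with this drain step run once it becomes readable.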
If you don't want to use TCP sockets directly, you might prefer ZeroMQ.
In this design, each client has three connections: the DBMS, the read-notify socket, and (maybe) the server-notify socket. The server has N+1 connections, for N clients and one listening socket. You have no locks to implement, very little tracking of participation, no problems re-synchronizing, and short windows of inconsistency among clients as each one acts on its notification.
I have a webapp using the Scala-based Scalatra web framework. The problem is, anytime the application is re-deployed, or anytime the app-server is rebooted, all session data is lost. This means (to name one downside) users must re-login every time we make an update to the site.
Some research reveals there are, apparently, "container-specific" ways to make sessions persist across app and server reboots (e.g., in the case of Tomcat), but this has two shortcomings:
If the app is not always deployed in the same container (and in the case of Scalatra, an embedded Jetty is used for dev purposes) then I'll need separate configuration for each container.
Using a server-local configuration file is much more fickle -- it's likely to get lost in server migrations, and it won't be automatically available to each instance (e.g., to each developer) of the app, whereas something stored with the core application code is much easier to test, retain, and generally keep track of.
So, to sum up...
Is there a generic, container-neutral way to make sessions persistent? Even if only by overriding appropriate methods in the Java/Servlet stack and storing the session data manually?
Barring that, is there a way to store relevant configuration for multiple containers (e.g., for both Jetty and Tomcat) in my application code (web.xml or similar)?
Thanks -- any insights appreciated!
Ok, so I came to this company that recalled its software from an offshore, no-longer-extant entity. We all know the drill.
In looking at the nuts and bolts, I come across the 'lock' keyword. Googling, I find that Entity Framework does not support multi-threading.
My question is: How can I be 100% certain that the application is attempting to run in multiple threads? Is the existence of the 'lock' keyword enough?
Thanks.
If this is an ASP.NET/MVC web app and you see the lock keyword, that is probably because the app runs in IIS, and IIS dispatches different user requests on different threads, so the web app becomes multi-threaded.
In the case of MVC, a controller is created per request and each request may be processed on a different thread. That leads to the need to lock anything that two users might access at the same time.
If this is a desktop app and the lock is where data access happens it might be for similar purpose.
The lock keyword alone is not enough; they could be using it incorrectly, after all. lock just prevents more than one thread from entering the protected area at any one time. What is being protected by the lock? Data stored in a static variable is available to all users (threads) using the app and so should have controlled access.
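To illustrate what a lock protects, here is a small sketch in Python rather than C#, with `threading.Lock` playing the role of the `lock` statement. The `TicketCounter` class is hypothetical and stands in for shared data such as a static field touched by several request threads.

```python
import threading


class TicketCounter:
    """Shared state, comparable to a static field used by all request threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            current = self._count
            self._count = current + 1

    @property
    def count(self):
        with self._lock:
            return self._count


def run_workers(counter, workers=8, increments=1000):
    """Hammer the counter from several threads and return the final value."""

    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.count
```

If a lock in the inherited code guards data like this that really is shared between requests, that is a sign the author expected concurrent access, though it does not prove the lock is used correctly.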
I am writing an app for iOS that uses data provided by a web service. I am using core data for local storage and persistence of the data, so that some core set of the data is available to the user if the web is not reachable.
In building this app, I've been reading lots of posts about core data. While there seems to be lots out there on the mechanics of doing this, I've seen less on the general principles/patterns for this.
I am wondering if there are some good references out there for a recommended interaction model.
For example, the user will be able to create new objects in the app. Let's say the user creates a new employee object: the user will typically create it, update it and then save it. I've seen recommendations to send each of these steps to the server: when the user creates the object, and again whenever the user changes a field. And if the user cancels at the end, a delete is sent to the server. A different recommendation for the same operation is to keep everything local, and only send the complete update to the server when the user saves.
This example aside, I am curious if there are some general recommendations/patterns on how to handle CRUD operations and ensure they are sync'd between the webserver and coredata.
Thanks much.
I think the best approach in the case you mention is to store data only locally until the point the user commits the adding of the new record. Sending every field edit to the server is somewhat excessive.
A general idiom of iPhone apps is that there isn't such a thing as "Save". The user generally will expect things to be committed at some sensible point, but it isn't presented to the user as saving per se.
So, for example, imagine you have a UI that lets the user edit some sort of record that will be saved to local core data and also be sent to the server. At the point the user exits the UI for creating a new record, they will perhaps hit a button called "Done" (N.B. not usually called "Save"). At the point they hit "Done", you'll want to kick off a core data write and also start a push to the remote server. The server push won't necessarily hog the UI or make them wait till it completes -- it's nicer to allow them to continue using the app -- but it is happening. If the push to the server fails, you might want to signal it to the user or do something appropriate.
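The flow described above (write locally on "Done", push in the background, report a failed push) can be sketched language-neutrally in Python. This is not Core Data code: `local_store` is a plain dict standing in for the local store, and `push_to_server` and `on_push_failed` are hypothetical callbacks.

```python
from concurrent.futures import ThreadPoolExecutor


def on_done(record, local_store, push_to_server, executor, on_push_failed):
    """Handle the "Done" tap: commit locally now, push to the server later."""
    local_store[record["id"]] = dict(record, synced=False)  # local write first

    def push():
        try:
            push_to_server(record)
        except Exception as error:  # e.g. the network is unreachable
            on_push_failed(record, error)
        else:
            local_store[record["id"]]["synced"] = True

    # submit() returns immediately, so the UI is not blocked by the push.
    return executor.submit(push)
```

Records left with `synced=False` can be retried the next time the server is reachable, which also covers the crash-or-power-loss case: the local write has already happened.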
A good question to ask yourself when planning the granularity of writes to core data and/or a remote server is: what would happen if the app crashed out, or the phone ran out of power, at any particular spots in the app? How much loss of data could possibly occur? Good apps lower the risk of data loss and can re-launch in a very similar state to what they were previously in after being exited for whatever reason.
Be prepared to tear your hair out quite a bit. I've been working on this, and the problem is that the Core Data samples are quite simple. The minute you move to a complex model and you try to use the NSFetchedResultsController and its delegate, you bump into all sorts of problems with using multiple contexts.
I use one context to populate data from the web service in a background "block", and a second for the table view to use - you'll most likely end up using a table view for a master list and a detail view.
Brush up on using blocks in Cocoa if you want to keep your app responsive whilst receiving or sending data to/from a server.
You might want to read about 'transactions', which are basically the grouping of multiple actions/changes into a single atomic action/change. This helps avoid partial saves that might result in inconsistent data on the server.
Ultimately, this is a very big topic - especially if server data is shared across multiple clients. At the simplest, you would want to decide on basic policies. Does last save win? Is there some notion of remotely held locks on objects in server data store? How is conflict resolved, when two clients are, say, editing the same property of the same object?
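Two of the simplest policies can be contrasted in a few lines of Python. The `version` field is an assumption added for illustration: a counter the server increments on every accepted save, which the client sends back with its edit.

```python
def last_save_wins(server_record, incoming):
    """Always accept the incoming edit, whatever the server currently holds."""
    return dict(incoming, version=server_record["version"] + 1), True


def reject_stale(server_record, incoming):
    """Accept the edit only if it was based on the server's current version."""
    if incoming["version"] != server_record["version"]:
        return server_record, False  # conflict: the client edited an old copy
    return dict(incoming, version=server_record["version"] + 1), True
```

With the second policy, a rejected client has to fetch the current record and decide how to merge or retry, which is where the harder conflict-resolution questions start.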
With respect to how things are done on the iPhone, I would agree with occulus that "Done" provides a natural point for persisting changes to server (in a separate thread).